BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220812T074357Z
LOCATION:Osaka Room
DTSTART;TZID=Europe/Stockholm:20220627T160000
DTEND;TZID=Europe/Stockholm:20220627T180000
UID:submissions.pasc-conference.org_PASC22_sess137@linklings.com
SUMMARY:MS2A - Leveraging Data Lakes to Manage and Process Scientific Data
DESCRIPTION:Minisymposium\n\nIn recent years, data lakes have become incre
 asingly popular as central storage, particularly for unstructured data. Ge
 nerally, data lakes aim to integrate heterogeneous data from diverse sourc
 es into a unified information management system, where data is retained in
  its original format. Storing data in raw format, rather than inferring a
  schema on write as is commonly done in a data warehouse, supports the re
 use and sharing of already collected data. The idea is to basically dump t
 he data into the lake and later fish for knowledge using sophisticated ana
 lysis tools. This approach, however, is quite challenging since it has to 
 be ensured that all data, no matter the number or size of the different da
 ta sets, will be found and can be accessed later on. In addition, especial
 ly for domain researchers in public research institutions, a research data
  management solution should not only ensure the preservation of the data b
 ut also support and guide scientists in complying with good scientific pra
 ctices from the very beginning. In order to discuss the current challenges
 , their possible solutions and share personal insights into data lakes, we
  bring different experts together and discuss with the scientific communit
 y the potential and technical approaches.\n\nData Integration in Data Lake
 s\n\nHai\n\nAlthough big data has been discussed for some years, it still
  has many research challenges, such as the variety of data. The diversity
  o
 f data sources often exists in information silos, which are a collection o
 f non-integrated data management systems with heterogeneous schemas, query
  languages, and ...\n\n---------------------\nA FAIR Digital Object-Based 
 Data Lake Architecture to Support Various User Groups and Scientific Domai
 ns\n\nNolte\n\nAcross various domains, data lakes are successfully utilize
 d to centrally store all data of an organization in their raw format. This
  promises a high reusability of the stored data since a schema is implied 
 on read, which prevents an information loss due to ETL (Extract, Transform
 , Load) processes. ...\n\n---------------------\nEnabling Industrialized A
 nalysis of Textual Documents in Data Lakes\n\nSawadogo\n\nThe concept of d
 ata lake was introduced in 2010 by James Dixon as an alternative to data w
 arehouses for big data analysis and management. Unlike data warehouses, da
 ta lakes follow a schema-on-read approach to better support ad hoc analyse
 s. In the absence of a fixed schema, data from the lake ...\n\n-----------
 ----------\nUtilizing Data Lakes for Managing Multidisciplinary Research D
 ata\n\nGreiner\n\nScientific research institutes face a lot of the same ch
 allenges as commercial organizations when it comes to managing data. Just 
 like for commercial organizations, a common situation is data silos, or ev
 en worse, data swamps. The fundamental problem is that the continual
  manual effort needed to gove...\n\n\nDomain: Computer Science and
  Applied Mathematics
END:VEVENT
END:VCALENDAR
