BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220812T074334Z
LOCATION:Osaka Room
DTSTART;TZID=Europe/Stockholm:20220627T170000
DTEND;TZID=Europe/Stockholm:20220627T173000
UID:submissions.pasc-conference.org_PASC22_sess137_msa175@linklings.com
SUMMARY:Enabling Industrialized Analysis of Textual Documents in Data Lake
 s
DESCRIPTION:Minisymposium\n\nEnabling Industrialized Analysis of Textual D
 ocuments in Data Lakes\n\nSawadogo\n\nThe concept of the data lake was in
 troduced in 2010 by James Dixon as an alternative to data warehouses f
 or big data analysis and management. Unlike data warehouses, data lake
 s follow a schema-on-read approach to better support ad hoc analyses. I
 n the absence of a fixed schema, data from the lake can be handled in v
 arious ways. This, however, makes industrialized analyses from data la
 kes difficult. More recently, the concept of the data lakehouse has be
 en proposed as a solution to enable industrialized analyses in data la
 kes. It consists of merging the best of the data lake and data warehou
 se concepts. Nevertheless, data lakehouses are still limited, as they e
 ssentially focus on structured data management. Yet the majority of bi
 g data consists of unstructured data, including textual data. To remed
 y the limitations of data lakehouses, we introduce a new approach to en
 able industrialized analyses on textual documents from a data lake. Ou
 r approach is based on techniques from the information retrieval and t
 ext-mining domains. In this presentation, we particularly focus on arc
 hitecture and metadata management, which are essential issues when bui
 lding a data lake system.\n\nDomain: Computer Science and Applied Mathem
 atics
END:VEVENT
END:VCALENDAR
