The workshop will bring together bioinformaticians and systems researchers who share an interest or experience in genomics and Hadoop. We will have speakers with experience in building Hadoop NGS pipelines (MapReduce SEAL, PigSeq and Cuneiform/HiWAY), in new formats for representing NGS data in Hadoop (Hadoop-BAM), and in support for security in Hadoop (BiobankCloud). We will also have speakers on the systems and operations issues of building and scheduling Hadoop pipelines in frameworks such as Spark and MapReduce on YARN. The second day of the workshop will be practical: small working groups will tackle specific problems in NGS data on Hadoop, including a hackathon on a platform such as Adam/Cuneiform/SEAL/PigSeq.
Registration is free and includes lunches.
The workshop will be free of charge, as it is sponsored by the Big Data Working Group at KTH, SeRC and the EU FP7 project BiobankCloud.