Solutions on Blackboard
Tuesday, January 4th, 2011, posted by Djoerd Hiemstra
Solutions for Assignment 4 (Sawzall) and for Assignment 5 (HBase Schema) are now on Blackboard.
The solutions to Assignment 3 are now online in the Course Material section on Blackboard. You need these solutions for Assignment 4, which is due next Friday, 10 December.
Next Monday, 6 December, from 14:30 to 15:15 in ZI-3126, there will be a short meeting to discuss the solutions for Assignments 2 and 3. The solutions, which are helpful for Assignment 4, will also be put on Blackboard.
Next Tuesday, 7 December: the Hadoop Hackathon!
The grades for Assignment 2 are now on Blackboard’s Grade Center. A correct solution for Assignment 2, which is needed for Assignment 3, can be found under “Course Materials” on Blackboard.
The grades for Assignment 1 are now on Blackboard’s Grade Center. Please send me an email as soon as possible if you cannot find your grade, if you cannot find an explanation of your grade (including a per-question result), or if you did not submit solutions for Assignment 1 at all but still want to participate in the course. The deadline for Assignment 2 is next Friday, 26 November.
On Friday 26 November, Peter Dickman from Google will talk about Google’s infrastructure. The lecture will start at 10:30 (so 15 minutes earlier than usual) in RA-1501.
This is a rapid overview of the approach Google uses to develop and offer global products. I will briefly (and somewhat superficially) cover the whole of our infrastructure from physical systems, such as the data centers, through the software stack to our software development methodology and the corporate engineering culture that both builds and utilizes the infrastructure.
Peter Dickman is an engineering manager in Google’s main European engineering centre in Zurich. He is involved with both the internals of the Google search engine and projects to protect user data in Google’s systems. Prior to working at Google, Peter was an academic in the UK, researching large-scale distributed systems (though on arrival at Google he discovered what large really meant).
The crash course Functional Programming, which should enable you to describe the word count program in a functional language, will be given by Maarten Fokkinga in room Zilverling, West 1, on Friday 19 November, 13:45-15:30. We’ll use the programming language Amanda (a single executable that runs under Windows), but any other functional programming language, such as Haskell, may be used for the homework as well. A download for Amanda is available with the material for Assignment 2.
On December 7, SARA (the Dutch National High Performance Computing and e-Science Support Center) organizes a day-long hackathon to kick off a Proof-of-Concept Hadoop service and to give participants the opportunity to experiment with Hadoop with the support of experienced users. People who are interested can work with Hadoop on a case of their choice, or simply play with datasets like Wikipedia, the ENRON dataset, White House visitor records, Genome data or others.
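To give an idea of what "the word count program in a functional style" can look like, here is a minimal sketch. It is written in Python rather than Amanda or Haskell, and it is not the assignment's solution; the function names map_phase, reduce_phase and word_count are made up for this illustration.

```python
from functools import reduce


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_phase(counts, pair):
    """Fold one (word, n) pair into a new dictionary of running totals."""
    word, n = pair
    # Build a new dictionary instead of mutating the old one,
    # to stay close to the functional style.
    return {**counts, word: counts.get(word, 0) + n}


def word_count(lines):
    """Count words by mapping every line to pairs and folding the pairs."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce(reduce_phase, pairs, {})


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog"]))
```

The point of the sketch is the shape of the computation: a map step that turns input into key/value pairs, followed by a fold that combines values per key. In Amanda or Haskell the same shape would be written with the language's own map and fold functions.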
Welcome to Distributed Data Processing using MapReduce
This course builds on some very exciting developments in cloud computing and data centers, initiated by Google and followed by many others such as Yahoo, Amazon, AOL, Baidu, Joost, Mylife, and Facebook. The course is about processing terabytes of data on large clusters. But there is more: few courses in the Computer Science master’s programme are so “core computer science”. We will discuss new file systems (GFS and Hadoop FS), new programming paradigms (MapReduce), new programming languages and query languages (Sawzall, Pig Latin), and new database paradigms (BigTable, Cassandra and Dynamo), and of course many web search and data mining applications that made Google one of today’s leading IT companies.
We hope to see you at our lectures on Fridays, in the 3rd and 4th hour.
Robin Aly, Maarten Fokkinga, and Djoerd Hiemstra.
Enschede will open an expertise centre for cloud computing on Thursday 17 June. The Centre 4 Cloud Computing will support open innovation and the sharing of knowledge on cloud computing. Cloud computing is an Internet-based computing paradigm, whereby shared resources, software and information are provided on-demand in a highly scalable way.
The expertise centre offers companies and organisations the following: