Archive for the 'Course MapReduce' Category

Solutions on Blackboard

Tuesday, January 4th, 2011, posted by Djoerd Hiemstra

Solutions for Assignment 4 (Sawzall) and for Assignment 5 (HBase Schema) are now on Blackboard.

Solutions to Assignment 3

Monday, December 6th, 2010, posted by Djoerd Hiemstra

The solutions to Assignment 3 are now online in the Course Material section on Blackboard. You need these solutions for Assignment 4, which is due next Friday, 10 December.

Small Haskell wrap-up meeting

Friday, December 3rd, 2010, posted by Djoerd Hiemstra

Next Monday, 6 December, from 14:30 to 15:15 h. in ZI-3126, there is a short meeting to discuss the solutions for Assignments 2 and 3. The solutions, which are helpful for Assignment 4, will also be put on Blackboard.
Next Tuesday, 7 December: the Hadoop Hackathon!

Solution for Assignment 2

Monday, November 29th, 2010, posted by Djoerd Hiemstra

The grades for Assignment 2 are now on Blackboard’s Grade Center. A correct solution for Assignment 2, which is needed for Assignment 3, can be found under “Course Materials” on Blackboard.

Grades for Assignment 1 on Blackboard Grade Center

Tuesday, November 23rd, 2010, posted by Djoerd Hiemstra

The grades for Assignment 1 are now on Blackboard’s Grade Center. Please send me an email as soon as possible if you cannot find your grades, if you cannot find an explanation of your grade (including a per-question result), or if you did not submit solutions for Assignment 1 at all but still want to participate in the course. The deadline for Assignment 2 is next Friday, 26 November.

Guest lecture by Peter Dickman from Google

Tuesday, November 16th, 2010, posted by Djoerd Hiemstra

On Friday 26 November, Peter Dickman from Google will talk about Google’s infrastructure. The lecture will start at 10:30 h. (15 minutes earlier than usual) in RA-1501.

This is a rapid overview of the approach Google uses to develop and offer global products. I will briefly (and somewhat superficially) cover the whole of our infrastructure from physical systems, such as the data centers, through the software stack to our software development methodology and the corporate engineering culture that both builds and utilizes the infrastructure.

Peter Dickman is an engineering manager in Google’s main European engineering centre in Zurich. He is involved with both the internals of the Google search engine and projects to protect user data in Google’s systems. Prior to working at Google, Peter was an academic in the UK, researching large-scale distributed systems (though on arrival at Google he discovered what large really meant).

Crash course Functional Programming

Friday, November 12th, 2010, posted by Maarten Fokkinga

The crash course Functional Programming will be given by Maarten Fokkinga in room Zilverling, West 1, on Friday Nov 19, 13:45-15:30. Its aim is to enable you to describe the word count program in a functional language. We’ll use the programming language Amanda (a single executable that runs under Windows), but any other functional programming language, such as Haskell, may be used for the homework as well. A download for Amanda is available with the material for Assignment 2.

SARA organizes Hadoop hackathon

Thursday, November 4th, 2010, posted by Djoerd Hiemstra

On December 7, SARA (the Dutch National High Performance Computing and e-Science Support Center) organizes a day-long hackathon to kick off a Proof-of-Concept Hadoop service and to give people the opportunity to experiment with Hadoop with the support of experienced users. Those who are interested can work with Hadoop on a case of their choice, or simply play with datasets such as Wikipedia, the Enron dataset, White House visitor records, genome data, or others.

See: SARA starts Apache Hadoop Proof-of-Concept.

Welcome to the MapReduce course

Friday, October 15th, 2010, posted by Djoerd Hiemstra

Welcome to Distributed Data Processing using MapReduce

This course keeps up with some very exciting developments in cloud computing and data centers, initiated by Google and followed by many others such as Yahoo, Amazon, AOL, Baidu, Joost, Mylife, and Facebook. The course is about processing terabytes of data on large clusters. But not only that: few courses in the Computer Science master’s will be so “core computer science”. We will discuss new file systems (GFS and Hadoop FS), new programming paradigms (MapReduce), new programming languages and query languages (Sawzall, Pig Latin), new database paradigms (BigTable, Cassandra and Dynamo), and of course many web search and data mining applications that made Google one of today’s leading IT companies.

We hope to see you at our lectures on Fridays, 3rd and 4th hour.
Robin Aly, Maarten Fokkinga, and Djoerd Hiemstra.

Expertise centre for cloud computing

Thursday, June 3rd, 2010, posted by Djoerd Hiemstra

Enschede will open an expertise centre for cloud computing on Thursday 17 June. The Centre 4 Cloud Computing will support open innovation and the sharing of knowledge on cloud computing. Cloud computing is an Internet-based computing paradigm, whereby shared resources, software and information are provided on-demand in a highly scalable way.

Cloud computing logical diagram

The expertise centre offers companies and organisations the following:

  1. Knowledge Exchange: To make (applied) knowledge and best practices available to professionals, management and other interested parties
  2. Research: Scientific applied research into technical, security, legal, and business aspects of cloud computing
  3. Commercial: Contribute to business development for companies that offer services based on cloud computing solutions

For more information, see http://www.centre4cloud.com