Monthly Archives: November 2013

What to do over the Christmas break?

This week the second year students presented their advice on how to get some productive work done over the Christmas break.

Jake made a video which summarizes things pretty well:


Automatic Essay Scoring

Lingnan Dai presented his Part II project on automatic essay scoring.

Abstract: One interesting application of Natural Language Processing today is Automatic Essay Scoring (AES), which is the technology that automatically evaluates and scores the quality of writing. The goal of AES is to build models that evaluate writing as reliably as human readers, while providing many advantages over manual marking, such as constant application of marking criteria and faster assessment. I’ll briefly introduce the usual procedure underlying such a system, highlighting textual parsing, feature extraction and machine learning, as well as giving examples on different approaches currently deployed in commercial and academic applications. This will be followed by a more in-depth look at the Entity-based coherence model I’m developing in my Part II Project and the evaluation schemes used to test and evaluate an AES.
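The entity-based coherence model mentioned in the abstract is usually built on an "entity grid" (rows = sentences, columns = entities, cells = whether/how the entity appears). The abstract doesn't spell out the implementation, so here is only a hedged toy sketch that skips the parsing step entirely (entity mentions are supplied by hand, and syntactic roles are collapsed to present/absent):

```python
from collections import Counter
from itertools import product

def entity_grid(sentences_entities):
    """Toy entity grid: for each entity, a column of 'X' (mentioned in
    that sentence) or '-' (absent). A full model would also distinguish
    syntactic roles such as subject/object, which requires parsing."""
    entities = sorted(set().union(*sentences_entities))
    return {e: ['X' if e in s else '-' for s in sentences_entities]
            for e in entities}

def transition_probs(grid):
    """Probabilities of length-2 column transitions ('XX', 'X-', '-X',
    '--') -- the kind of features a coherence scorer is trained on."""
    counts = Counter()
    for column in grid.values():
        for a, b in zip(column, column[1:]):
            counts[a + b] += 1
    total = sum(counts.values())
    return {''.join(t): counts.get(''.join(t), 0) / total
            for t in product('X-', repeat=2)}

# Three sentences; entity mentions are given by hand instead of a parser.
sents = [{'Microsoft', 'earnings'}, {'Microsoft'}, {'earnings', 'stock'}]
grid = entity_grid(sents)
probs = transition_probs(grid)
```

Intuitively, coherent texts keep mentioning the same entities in adjacent sentences, so they show more 'XX' transitions than a shuffled version of the same text.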

Summing some random numbers from 1 to n

We had a fun discussion at dinner about a maths problem that Sid knew: given the numbers 1 to n, how many of them, on average, do we need to pick at random before the sum exceeds n? The answer is that as n gets large the expected number of draws tends to ‘e’ (i.e. 2.718…).

We couldn’t work this out for ourselves at the table but fortunately there was a mathematician nearby. James typed up his proof: It tends to ‘e’. (The proof makes use of Markov chains, which are not covered until the second-year course Mathematical Methods for Computer Science.)
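The claim is also easy to check empirically. A quick Monte Carlo sketch (assuming draws are uniform on 1..n with replacement; for large n the distinction from drawing without replacement barely matters, since only a handful of draws are ever needed):

```python
import random

random.seed(0)  # reproducible runs

def draws_to_exceed(n):
    """Draw uniformly from 1..n until the running sum exceeds n;
    return how many draws it took."""
    total, count = 0, 0
    while total <= n:
        total += random.randint(1, n)
        count += 1
    return count

def average_draws(n, trials=20_000):
    """Monte Carlo estimate of the expected number of draws."""
    return sum(draws_to_exceed(n) for _ in range(trials)) / trials

avg = average_draws(1000)  # lands close to e ≈ 2.718 for large n
```

Running this for n = 1000 gives an average within a few hundredths of e, matching the proof.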

Grammar induction for text compression

Craig Saperstein presented his Part II project on using grammar induction for text compression.

Abstract: Grammar induction is an approach to lossless compression that has been making steady recent progress. Variants achieve 5–10% better compression ratios on DNA sequences than gzip or bzip2, and comparable performance on standard text data sets. I’ll talk briefly about the field, followed by an in-depth look at the particular grammar induction method that I’m implementing for my Part II project.
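The abstract doesn’t name the specific method, but the flavour of grammar-based compression can be sketched with a minimal Re-Pair-style scheme (purely illustrative, not necessarily the project’s algorithm): repeatedly replace the most frequent adjacent pair of symbols with a fresh nonterminal, recording a grammar rule for it.

```python
from collections import Counter

def repair(symbols):
    """Minimal Re-Pair-style grammar induction: replace the most common
    adjacent pair with a new nonterminal until no pair repeats. (Pair
    counts here include overlaps, so this greedy version is approximate.)"""
    symbols = list(symbols)
    rules, next_id = {}, 0
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs or pairs.most_common(1)[0][1] < 2:
            break
        pair = pairs.most_common(1)[0][0]
        nonterminal = f'R{next_id}'
        next_id += 1
        rules[nonterminal] = pair
        out, i = [], 0
        while i < len(symbols):  # left-to-right replacement pass
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(nonterminal)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols, rules

def expand(symbol, rules):
    """Decompression: expand a symbol back to the original string."""
    if symbol in rules:
        a, b = rules[symbol]
        return expand(a, rules) + expand(b, rules)
    return symbol

seq, rules = repair('abababab')
restored = ''.join(expand(s, rules) for s in seq)
```

The compressed output is the final symbol sequence plus the rules; a repetitive input like 'abababab' collapses to just two symbols ('R1 R1' with R1 → R0 R0, R0 → a b), and expanding the grammar reconstructs the input exactly, which is what makes the compression lossless.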

Computer Vision & Gaze Tracking

Stephen Cook presented his Part II project on gaze tracking with a laptop.

Abstract: Computer vision is becoming more and more prevalent in many areas, from medical analysis to robotics. The area is still far behind the abilities of a human, and this makes problems like gaze tracking very difficult to solve reliably. I will give an overview of the algorithms needed to create a gaze tracker: using rectangles to detect faces, and gradient analysis to detect eyes. These algorithms are used widely in industry, and are likely the same as those used by devices such as digital cameras, the Kinect, and autonomous vehicles (such as the Mars Rover and the Google Car).
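The "gradient analysis" step for eye detection is often based on a means-of-gradients idea (in the spirit of Timm and Barth's eye-centre localisation): the eye centre is the point from which the image gradients, which radiate outward around a dark circular pupil, are best aligned with the displacement directions. The abstract doesn't specify the exact algorithm, so this is only a hedged sketch on a synthetic image:

```python
import numpy as np

def eye_centre(image):
    """Locate the eye centre as the point where unit displacement
    directions best align with image gradients (mean squared dot
    product objective over the strong-gradient pixels)."""
    gy, gx = np.gradient(image.astype(float))  # row (y) then column (x)
    mag = np.hypot(gx, gy)
    mask = mag > 0.3 * mag.max()               # keep only strong gradients
    ys, xs = np.nonzero(mask)
    gxn, gyn = gx[mask] / mag[mask], gy[mask] / mag[mask]

    h, w = image.shape
    best, best_c = -1.0, (0, 0)
    for cy in range(h):                        # brute-force candidate scan
        for cx in range(w):
            dy, dx = ys - cy, xs - cx
            d = np.hypot(dx, dy)
            d[d == 0] = 1.0                    # skip the candidate pixel
            dots = (dx * gxn + dy * gyn) / d   # displacement . gradient
            score = np.mean(dots ** 2)
            if score > best:
                best, best_c = score, (cy, cx)
    return best_c

# Synthetic "pupil": intensity grows with distance from (17, 22),
# so every gradient points radially away from the true centre.
yy, xx = np.mgrid[0:40, 0:40]
img = np.hypot(yy - 17, xx - 22)
centre = eye_centre(img)
```

The squared dot product makes the objective indifferent to whether gradients point toward or away from the centre, so the same sketch works for a dark pupil on a light iris or the reverse; real systems restrict the scan to the eye regions found by the rectangle-based face detector first.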