This week the second year students presented their advice on how to get some productive work done over the Christmas break.

Jake made a video which summarizes things pretty well:

Lingnan Dai presented his Part II project on automated essay scoring.

Abstract: One interesting application of Natural Language Processing today is Automatic Essay Scoring (AES), which is the technology that automatically evaluates and scores the quality of writing. The goal of AES is to build models that evaluate writing as reliably as human readers, while providing many advantages over manual marking, such as constant application of marking criteria and faster assessment. I’ll briefly introduce the usual procedure underlying such a system, highlighting textual parsing, feature extraction and machine learning, as well as giving examples on different approaches currently deployed in commercial and academic applications. This will be followed by a more in-depth look at the Entity-based coherence model I’m developing in my Part II Project and the evaluation schemes used to test and evaluate an AES.
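The feature-extraction-plus-machine-learning pipeline the abstract describes can be caricatured in a few lines. This is a toy sketch, not Lingnan's system: the surface features and the linear scorer below are illustrative assumptions, and a real AES would work from parsed text and much richer features (including the entity-based coherence features mentioned above).

```python
import re

def extract_features(essay):
    """Toy surface features of the kind an AES pipeline might start from
    (illustrative only; real systems use parsed, far richer features)."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay.lower())
    return [
        len(words),                                       # essay length
        len(words) / max(len(sentences), 1),              # avg sentence length
        len(set(words)) / max(len(words), 1),             # type/token ratio
        sum(len(w) for w in words) / max(len(words), 1),  # avg word length
    ]

def score(essay, weights, bias=0.0):
    """Linear model over the features, standing in for the learned scorer."""
    return bias + sum(w * f for w, f in zip(weights, extract_features(essay)))
```

In practice the weights would be fitted by regression against human-marked essays, and the model evaluated by how closely its scores agree with the human markers.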

We had a fun discussion at dinner about a maths problem that Sid knew: given the numbers 1 to n, how many of them, on average, do we need to pick at random before the sum exceeds n? The answer is that as n gets large the expected number of draws tends to ‘e’ (i.e. 2.718…).

We couldn’t work this out for ourselves at the table, but fortunately there was a mathematician nearby. James typed up his proof: It tends to ‘e’. (The proof makes use of Markov chains, which are not covered until the second year course Mathematical Methods for Computer Science.)
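If you would rather check the claim empirically than follow the Markov-chain proof, a quick Monte Carlo simulation (assuming picks are uniform over 1 to n, with replacement) gets close to e:

```python
import random

def draws_until_sum_exceeds(n, rng):
    """Pick integers uniformly from 1..n (with replacement) until their
    sum exceeds n; return how many picks that took."""
    total = 0
    count = 0
    while total <= n:
        total += rng.randint(1, n)
        count += 1
    return count

def estimate_expected_draws(n, trials=100_000, seed=0):
    """Average the draw count over many trials; for large n this
    approaches e = 2.71828..."""
    rng = random.Random(seed)
    return sum(draws_until_sum_exceeds(n, rng) for _ in range(trials)) / trials
```

Running `estimate_expected_draws(1000)` lands within a few thousandths of 2.718.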

Craig Saperstein presented his Part II project on using grammar induction for text compression.

Abstract: Grammar induction is a way of doing lossless compression that has been making recent progress. Variants are achieving 5–10% better compression ratios on DNA sequences than gzip or bzip2, and comparable performance on standard text data sets. I’ll be talking briefly about the field, followed by an in-depth look at a particular grammar induction method that I’m implementing for my Part II project.
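The abstract doesn't say which method Craig is implementing, but the flavour of grammar-induction compression can be shown with a minimal Re-Pair-style sketch (an assumption for illustration, not his project): repeatedly replace the most frequent adjacent pair of symbols with a fresh nonterminal, recording each replacement as a grammar rule.

```python
from collections import Counter

def repair_compress(seq):
    """Greedy Re-Pair-style grammar induction: while some adjacent pair
    of symbols occurs at least twice, replace it with a new nonterminal
    and record the rule nonterminal -> pair."""
    seq = list(seq)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):  # left-to-right, non-overlapping replacement
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(sym, rules):
    """Recursively expand a symbol back into terminals."""
    if sym in rules:
        a, b = rules[sym]
        return expand(a, rules) + expand(b, rules)
    return [sym]

def decompress(seq, rules):
    return [t for s in seq for t in expand(s, rules)]

# Example: list("abababab") compresses to ['R1', 'R1'] with
# rules {'R0': ('a', 'b'), 'R1': ('R0', 'R0')}.
```

A real compressor would then entropy-code the final sequence and the rules; the grammar itself is what makes repeated structure (as in DNA) so cheap to represent.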

Stephen Cook presented his Part II project on gaze tracking with a laptop.

Abstract: Computer vision is becoming more and more prevalent in many areas, from medical analysis to robotics. The field is still far behind the abilities of a human, and this makes problems like gaze tracking very difficult to solve reliably. I will give an overview of the algorithms needed to create a gaze tracker: using rectangles to detect faces, and gradient analysis to detect eyes. These algorithms are used widely in industry, and are likely the same as those used by devices such as digital cameras, the Kinect, and autonomous vehicles (such as the Mars Rover and the Google Car).
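"Rectangles to detect faces" presumably refers to a Viola–Jones/Haar-cascade style detector, and "gradient analysis to detect eyes" is in the spirit of means-of-gradients eye-centre localisation. Here is a self-contained toy sketch of the latter idea on a synthetic image; the image, scoring rule, and all names below are illustrative assumptions, not Stephen's implementation:

```python
import math

def make_toy_image(size=21, center=(12, 9), radius=5):
    """Light background with a dark disc standing in for a pupil."""
    img = [[255.0] * size for _ in range(size)]
    cy, cx = center
    for y in range(size):
        for x in range(size):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                img[y][x] = 0.0
    return img

def find_eye_center(img):
    """Score every candidate centre by how well the displacement vectors
    to strong-gradient pixels align with those gradients (intensity
    increases radially outward at a pupil boundary); return the
    best-scoring pixel as (row, col)."""
    h, w = len(img), len(img[0])
    grads = []  # (x, y, unit gradient) at pixels with significant gradient
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag > 1e-6:
                grads.append((x, y, gx / mag, gy / mag))
    best, best_score = None, -1.0
    for cy in range(h):
        for cx in range(w):
            score = 0.0
            for x, y, gx, gy in grads:
                dx, dy = x - cx, y - cy
                d = math.hypot(dx, dy)
                if d < 1e-6:
                    continue
                dot = (dx / d) * gx + (dy / d) * gy
                score += max(dot, 0.0) ** 2  # reward outward-pointing gradients
            if score > best_score:
                best_score, best = score, (cy, cx)
        # the candidate where all boundary gradients point away wins
    return best
```

On the toy image the best-scoring candidate lands on the dark disc's centre; a real system would run this over an eye region cropped out by the face-rectangle step, on an actual camera frame.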