# Introduction to Algorithms:Course Information Sheet

### January 2001

Course code: DCS/127, Semester: 2.
Acknowledgements: Richard Bornat, Samin Ishtiaq and Jules Bean.
Resources: my TESTER, some textbooks and links.

## 1  Aims

• To understand why formal logic is essential for getting programs to work correctly, and
• to learn some simple methods using logical notation to specify properties of programs.
• To give an appreciation of the range and depth of knowledge of programming solutions to apparently simple problems;
• to increase perception of the beauty and intricacy of known algorithms;
• to increase understanding of the role of analysis in the construction of practical programs;
• to become more reflective when designing programs.

## 2  Objectives

• To understand and be able to employ predicate calculus notation in the specification of several algorithms, especially in describing the invariants of loops;
• to understand and be able to apply big-O notation for worst-case complexity analysis of programs including the algorithms treated in the course;
• to understand and be able to reproduce some algorithms for sorting and searching;
• to be able to construct arguments about the relative worst-case space and time complexity of some algorithms for sorting and searching.

## 3  Prerequisites

Introduction to Logic (DCS/122) and Programming I (DCS/101) are pre-requisites.
An understanding of mathematics up to A-level will be useful for the complexity analysis of the algorithms presented.
The essential skill is precise thinking, of a kind similar to that used in mathematics, though very little use is made of particular facts from the A-level (or degree-level) mathematics syllabus.

## 4  Description

The study of algorithms - carefully-crafted solutions to particularly important programming problems - is one of the oldest areas in Computer Science. In order truly to understand an algorithm you have first to understand why it works at all, then to understand how it performs in general - how long it takes and how much space it takes up while it is working - and last, understand how to tweak the algorithm to make its performance fit the particular circumstances in which you want to use it. In the case of some of the famous algorithms we shall meet, like quicksort, you also have to understand why it isn't easy to do better.
So the course will focus on two issues: correctness - how you say what an algorithm is supposed to do, and how you can argue that it actually does it - and so-called complexity - how to estimate the amount of work an algorithm must do to perform its task in general, or on average, or in the worst case. There will be some experimental work, trying to verify in practice the theoretical measures that can be developed by using mathematical arguments, but much of the course will be about analysing algorithms with pencil and paper.
Some of the algorithms we shall analyse are intensely beautiful. To appreciate their beauty you have to realise how much better they are than the first naive attempts one of us might make to solve the same problem, and how concisely they solve the problem which they are designed to solve. All of the algorithms we shall analyse are useful. Some of them are not entirely understood - it is a recurring feature of Computer Science that we continually invent things which we only partly understand.

## 5  Course Plan

From each of the algorithms that we study, we learn some general principle about the correctness of programs in general. These general principles, and not programming recipes for these particular examples, are what you are meant to take away from this course.
• Linear search: writing a method that satisfies a specification, fitting it in to a test program, and proving its correctness.
• Binary search: there are at least three published algorithms, differing by ±1 in various important places.
• Array shifting: assignment messes up the logic.
• Insertion sort: search and shift are used as subroutines in this quadratic algorithm.
• Selection sort: another quadratic sorting algorithm.
• Merge sort: this O(nlogn) algorithm is your first recursive program.
• Quick sort: another recursive program, that goes fast when you get a design decision right, it's easy to get this wrong.
• Heap sort and priority queues: your first object-oriented program.
Some harder topics will be treated in the lectures at the end of the course but will be optional for assessment:
• Hash tables: how to insert, look up and delete data by name very quickly.
• Finding a path through a network

## 6  Assessment

There are three items of assessment, each worth 15% of the course marks, together with the exam (55%). Two of these items are tests (mini-exams) conducted during the periods timetabled for lectures, and the third is work to be done during lab sessions and handed in by a deadline.
The relatively high percentage of test marks is intended to deter the 'listen; forget; revise' style of course attendance. All of the assessment is intended to make you think, and will contain traps to deter copying and rote learning.

• Data Structures and Problem Solving Using Java by Mark Allen Weiss, (Addison-Wesley);
• Reasoned Programming by Krysia Broda, Susan Eisenbach, Hessam Khoshnevisan and Steve Vickers, Prentice Hall, International Series in Computer Science, 1994.

## 8  Calendar (2000-1)

• Monday 11am-noon, lecture, Skeel Lecture Theatre, People's Palace
• Monday 2-6pm, supervised labs, ITL ground floor
• Tuesday 3-5pm, lecture, Physics Lecture Theatre
The middle item in each triplet is the lab work; the other two are lectures.
• 15/16 January (Week 1)
• Themes, timetable, assessment, resources, this week's work.
• Linear search (from left and right )
• Using semi-formal logic to describe programs
• 22/23 January (Week 2)
• 29/30 January (Week 3)
• 5/6 February (Week 4)
• 12/13 February (Week 5)
• 19/20 February (Week 6)
• Recursion; the idea of mergesort+quicksort; merge loop invt.
• Implementing merge and mergesort
• Binary trees, logarithms, exponentials, complexity. Hybrid mergesort.
• 26/27 February (Week 7)
• Test in the Great Hall on the results of lab work.
• Implementing mergesort and quicksort.
• Specifications. Split algorithms and choice of pivot for quicksort.
• 5/6 March (Week 8, Eid-al-Adha)
• Lecture cancelled for Eid.
• Implementing split and quicksort.
• Heaps, priority queues and heap sort
• 12/13 March (Week 9)
• 19/20 March (Week 10)
• Lecture cancelled because I was at a
• Implement heapification, heap sort and priority queues.
• Hash tables for fast access to dictionaries.
• 26/27 March (Week 11)
• Additional coursework options or finish regular coursework.
• 2/3 April (Week 12)
• Recursion- and data- (class-) invariants (non-examinable)
• Additional coursework options or finish regular coursework.
• Revision lecture.
• Friday 6 April (End of term): Coursework deadline
• Thursday 9 May
• exam at 10am in Stratford Old Town Hall.

## 9  Monitoring student work

If you are a student registered for this course, you should find in your personal filespace a directory (folder) called DCS127.
All of your work for this course must be done in this directory.
(Think of it as the electronic equivalent of a school exercise book for a particular subject.)
As the various pieces of lab work for the course are set, template files will be copied into this directory for you to work on. They contain the structure of the methods and classes that you have to implement, instructions for writing and testing your code, and questions for you to answer about the behaviour of the code.
Your answers to these questions, as well as your JAVA code, should be written in these files. Then all of the things that you are supposed to learn about each algorithm will be collected together in one place for you to learn and revise.
The directory also contains a short-cut (pt) to the teaching materials for the course. It is also the only place in which you can do the run, trace and check commands that invoke the automatic tester.
Everything that you do in this directory will be copied overnight to a place where your lecturer, teaching assistants, adviser and tutor can read it.
Only the contents of this directory will be copied and made available to these people. Other parts of your personal filespace (such as your email records) will not be copied. Other students will not be able to see the copy; in fact the access permissions on this directory itself are automatically reset so that only you, and not other students, can read its contents.
Only plain text files ( JAVA source files and notes that you make using xemacs in plain ASCII) will be copied. Compiled JAVA classes and files that appear to be mail or Microsoft Word format are not copied. Nor are directory trees called private, not_mine or RCS, and some other types of file 1 are also excluded.
When files are copied, a "receipt" for them is recorded in the file COPY_LOG. You cannot delete or modify this file yourself.
Modification histories of the files are also maintained (in the place where the copies are made) using the Unix RCS tool. It may be possible to recover old versions of the file from this.
The purpose of this is to ensure that you do the lab work that is set each week.
To submit your end of term coursework, you simply leave it in this directory.
The TESTER will be run on the copied files, but they will be printed out and looked at individually before being marked.
Files that you put in this directory must be entirely your own work.
Since some of the things that you put here will be marked as coursework, contributing to your final grade for this course and ultimately to your degree, they must be treated in the same way as exam scripts. The College examination regulations therefore apply.
If you wish to keep materials in this directory that are relevant to your study of this course, but are not your own work (for example, code that you have downloaded from the Web), you must put it in the subdirectory not_mine. It will then not be treated as coursework.

## 10  Assessment of coursework

The assessment for this course consists of 55% for the May exam and 15% for each of three items of term-time work, two of these being tests.
All lab work is collected automatically from the directory DCS127 in your filespace, every night. All work for this course should be done in this directory, as a matter of routine. In particular, you "submit" coursework simply by leaving it there.
There are two alternative ways of gaining the third 15% of marks:
• On the basis of your weekly coursework, or
• By doing one of the options listed below in the last three weeks of term.
You will be awarded the better of the two marks. You do not need to tell me which you want to count.
• To gain marks for each week's work (approximately 1% of the final credit per week), it must be done on time, not at the end of term.
• Normally, the deadline is the end of the day of the lab, but it will be set and changed each week depending on how difficult the class as a whole has found the exercise. No extensions of deadlines will be given for colds or single absences (and you will only cause annoyance by asking for them), but of course allowance will be made for medical problems that last several weeks. The intention behind this method of assessment is to persuade you to do the work that is set in each week's lab.
• Formally, the assessment will be for the term's work collectively, not for individual items. Assessment will include a elements for programming style, etc., in which the items that have been submitted will be examined qualitatively, not quantitatively.
• Do not think that you can miss labs and make up the marks by submitting coursework at the end of term instead. Such work will be examined critically and suspiciously for correctness and evidence of copying. But lab work is there to help you learn, so it is rather unlikely that you will be able to do the (more difficult) end of term coursework well unless you have done the earlier stuff properly.
• The optional topics are intended for the best students. Mediocre work will get far fewer marks than it did last year.

## 11  The optional topics

• Hash tables for creating, accessing and updating dictionaries.
• Finding a path between two nodes in a network was covered in 2000-1 for the first time.
• External sorting of a list of names (such as the US population) that fits on disk but not in RAM. You have to do your own reading on this option - there will be no lectures.
• Heap sort and priority queues were an option in 1998-9 and 1999-2000, but became a core part of the course (with structured lab work and exam question) in 2000-1.
• Minimax and alpha-beta pruning for playing games such as chess or draughts. This was an option in 1998-9 and 1999-2000, prompted by the Software Engineering projects about implementing games, but not very many students took it, so it was droppped in 2000-1.
• Black box testing of an alleged sorting method was an option in 1999-2000, but only one student (Daniel Dor-Chay) took it. He discovered the "sawtooth" array that makes quicksort quadratic, but otherwise this option was not a very good idea.
You can get credit for the options listed above twice:
• 15% for assessed lab-work. You have to implement the algorithm, and hand in your code on 6 April, along with some of the results of testing that the code works.
• Approximately 14% more for an optional essay in the exam.
Do not write an essay for the coursework. Postpone further reading and experimentation (beyond what is necessary to get your program working) until after you have handed in your program.

## 12  The essay question in the exam

You do not have to choose the same topic for the coursework and the exam, but it would obviously be more effective in gaining marks to do so. However, the coursework and exam test different aspects of the topic.
For the exam you are not expected to remember and reproduce 200 lines of code, but you will be expected to give the code for the core algorithms. Your essay should also summarise what you have learned about the algorithm from running and modifying your own program or my compiled model answer, my lectures, Richard Bornat's lecture notes, Weiss's book, and any other sources you can find.
The essay question was compulsory in May 1999, but in 2000 and 2001 students had to choose four questions out of eight, one of them being the essay; the 14% figure above is based on this.
Writing an "essay" in an exam is not a piece of English composition (although I would like you to know how to spell "pigeonhole", and the difference between "its" and "it's"). At school you were probably told to write "a beginning, a middle and an end", but any "introduction" or "conclusion" you write for this essay is unlikely to (contain material which will) gain you any marks. An exam essay is like one of those TV game shows where you have to remember as many items as possible that have gone past on a conveyor belt ("TV, teddy bear, toaster, ..."). How many facts can you remember about this topic?
The exam rubric for the essay question [in May 2000] read as follows:
Write an essay about one of the [above] topics.
Your answer should include a selection of the most important parts of the code: do not include JAVA declarations unless they illustrate some aspect of the algorithm. Explain the main points in proving correctness. State the main facts about complexity.
You are unlikely to gain any marks at all for this question unless you have implemented the algorithm in the lab sessions and thought carefully about its behaviour. Remember that this is an exam about algorithms, not about English composition, and that your answer will be marked according to the information that it contains about algorithms: wordy answers to questions like this gain very few marks.
The marking of both the coursework and essay question will be open-ended: the mark-scheme will give full marks for the basic issues, but generous bonus marks will be given for anything else you discover by experimentation with your program. Unfortunately, it is not within my power to give total marks above 100% for the course, or to give any prizes for supererogation.

## 13  Warning about plagiarism

It is the College's policy to punish plagiarism severely.
• This is part of your degree assessment, and is intended to be individual work.
• Copying from other students' work (with or without their consent) is plagiarism.
• If you work in pairs or larger groups, each member of the group must submit identical work, which must state the names of all of the members of the group and the agreed proportions of the work that they contributed. If you have had the help of others to do the work you submit then you will of course get fewer marks, to compensate.
• To copy code out from Richard Bornat's notes, Weiss's book or web pages, or from any other sources without identifying the source precisely is also plagiarism,
• If you use a book, you must give the author, title, publisher and page numbers.
• If you use the web you must give the full URL of the relevant page, not just the name of the site.
• Changing what you copy from books, web pages or other students counts as deliberate attempted deception. It is also insulting to expect someone (a lecturer or a colleague) to give individual attention to something that does not in fact have any original content.
Giving precise references is a basic professional habit, and is useful. For example if you've filched someone else's code like this, and adapted it to your own needs, and then it doesn't work, you (or your colleagues) can go directly back to the original source.
You don't have to cite your sources in the exam (in my course).

### Footnotes:

1Emacs backup files (file ∼  and #file#); "dot" files (.login); your "receipt" COPY_LOG; files whose names contain spaces, quotes or other strange characters; symbolic links; PostScript; PDF; executables; images (picture.gif); archives (tester.jar, folder.zip, folder.tar.gz).

This is www.PaulTaylor.EU/algorithms/info.html and it was derived from algorithms/info.tex which was last modified on 31 December 2007.