Introduction to Algorithms:
Course Information Sheet
Paul Taylor
January 2001
Course code: DCS/127, Semester: 2.
Acknowledgements:
Richard Bornat,
Samin Ishtiaq
and
Jules Bean.
Resources:
my TESTER,
some textbooks and links.
1 Aims
- To understand why formal logic is essential for getting programs
to work correctly, and
- to learn some simple methods using logical notation to specify
properties of programs.
- To give an appreciation of the range and depth of knowledge of
programming solutions to apparently simple problems;
- to increase perception of the beauty and intricacy of known
algorithms;
- to increase understanding of the role of analysis in the
construction of practical programs;
- to become more reflective when designing programs.
2 Objectives
- To understand and be able to employ predicate calculus notation in
the specification of several algorithms, especially in describing
the invariants of loops;
- to understand and be able to apply big-O notation for worst-case
complexity analysis of programs including the algorithms treated in
the course;
- to understand and be able to reproduce some algorithms for
sorting and searching;
- to be able to construct arguments about the relative worst-case
space and time complexity of some algorithms for sorting and
searching.
3 Prerequisites
Introduction to Logic (DCS/122) and Programming I (DCS/101) are
pre-requisites.
An understanding of mathematics up to A-level will be useful for
the complexity analysis of the algorithms presented.
The essential skill is precise thinking, of a kind similar to
that used in mathematics, though very little use is made of particular
facts from the A-level (or degree-level) mathematics syllabus.
4 Description
The study of algorithms - carefully-crafted solutions to particularly
important programming problems - is one of the oldest areas in
Computer Science. In order truly to understand an algorithm
you have first to understand why it works at all, then to understand
how it performs in general - how long it takes and how much space it
takes up while it is working - and last, understand how to tweak the
algorithm to make its performance fit the particular circumstances in
which you want to use it. In the case of some of the famous algorithms
we shall meet, like quicksort, you also have to understand why it
isn't easy to do better.
So the course will focus on two issues: correctness - how you
say what an algorithm is supposed to do, and how you can argue that it
actually does it - and so-called complexity - how to
estimate the amount of work an algorithm must do to perform its task
in general, or on average, or in the worst case. There will be some
experimental work, trying to verify in practice the theoretical
measures that can be developed by using mathematical arguments, but
much of the course will be about analysing algorithms with pencil and
paper.
Some of the algorithms we shall analyse are intensely beautiful. To
appreciate their beauty you have to realise how much better they are
than the first naive attempts one of us might make to solve the same
problem, and how concisely they solve the problem which they are
designed to solve. All of the algorithms we shall analyse are useful.
Some of them are not entirely understood - it is a recurring feature
of Computer Science that we continually invent things which we only
partly understand.
5 Course Plan
From each of the algorithms that we study, we learn some general
principle about the correctness of programs in general. These general
principles, and not programming recipes for these particular examples,
are what you are meant to take away from this course.
- Linear search: writing a method that satisfies a specification,
fitting it in to a test program, and proving its correctness.
- Binary search: there are at least three published algorithms,
differing by ±1 in various important places.
- Array shifting: assignment messes up the logic.
- Insertion sort: search and shift are used as subroutines
in this quadratic algorithm.
- Selection sort: another quadratic sorting algorithm.
- Merge sort: this O(nlogn) algorithm is
your first recursive program.
- Quick sort: another recursive program, that goes fast
when you get a design decision right, it's easy to get this wrong.
- Heap sort and priority queues: your first
object-oriented program.
Some harder topics will be treated in the lectures at the end of the course
but will be optional for assessment:
- Hash tables: how to insert, look up and delete data by
name very quickly.
- Finding a path through a network
6 Assessment
There are three items of assessment, each worth 15% of the
course marks, together with the exam (55%). Two of these items are
tests (mini-exams) conducted during the periods timetabled for
lectures, and the third is work to be done during lab sessions and
handed in by a deadline.
The relatively high percentage of test marks is intended to deter the
'listen; forget; revise' style of course attendance. All of the
assessment is intended to make you think, and will contain
traps to deter copying and rote learning.
7 Reading
- Data Structures and Problem Solving Using Java
by Mark Allen Weiss, (Addison-Wesley);
- Reasoned Programming
by Krysia Broda, Susan Eisenbach,
Hessam Khoshnevisan
and Steve Vickers, Prentice Hall,
International Series in Computer Science, 1994.
8 Calendar (2000-1)
- Monday 11am-noon, lecture,
Skeel Lecture Theatre, People's Palace
- Monday 2-6pm, supervised labs, ITL ground floor
- Tuesday 3-5pm, lecture, Physics Lecture Theatre
The middle item in each triplet is the lab work; the other two are lectures.
- 15/16 January (Week 1)
- Themes, timetable, assessment, resources, this week's work.
- Linear search (from left and right )
- Using semi-formal logic to describe programs
- 22/23 January (Week 2)
- 29/30 January (Week 3)
- 5/6 February (Week 4)
- 12/13 February (Week 5)
- 19/20 February (Week 6)
- Recursion; the idea of mergesort+quicksort; merge loop invt.
- Implementing merge and mergesort
- Binary trees, logarithms, exponentials, complexity. Hybrid mergesort.
- 26/27 February (Week 7)
- Test
in the Great Hall on the results of lab work.
- Implementing mergesort and quicksort.
- Specifications. Split algorithms and choice of pivot for quicksort.
- 5/6 March (Week 8, Eid-al-Adha)
- Lecture cancelled for Eid.
- Implementing split and quicksort.
- Heaps, priority queues and heap sort
- 12/13 March (Week 9)
- 19/20 March (Week 10)
- Lecture cancelled because I was at a
- Implement heapification, heap sort and priority queues.
- Hash tables for fast access to dictionaries.
- 26/27 March (Week 11)
-
- Additional coursework options
or finish regular coursework.
-
- 2/3 April (Week 12)
- Recursion- and data- (class-) invariants (non-examinable)
- Additional coursework options or finish regular coursework.
- Revision lecture.
- Friday 6 April (End of term):
Coursework deadline
- Thursday 9 May
- exam at 10am in Stratford Old Town Hall.
9 Monitoring student work
If you are a student registered for this course, you should find in
your personal filespace a directory (folder) called DCS127.
All of your work for this course must be done in this directory.
(Think of it as the electronic equivalent of a school exercise book
for a particular subject.)
As the various pieces of lab work for the course are set, template
files will be copied into this directory for you to work on. They
contain the structure of the methods and classes that you have to
implement, instructions for writing and testing your code, and
questions for you to answer about the behaviour of the code.
Your answers to these questions, as well as your JAVA code, should be
written in these files. Then all of the things that you are supposed
to learn about each algorithm will be collected together in one place
for you to learn and revise.
The directory also contains a short-cut (pt) to the teaching
materials for the course. It is also the only place in which
you can do the run, trace and check commands that invoke the automatic tester.
Everything that you do in this directory will be copied overnight
to a place where your lecturer, teaching assistants, adviser and tutor
can read it.
Only the contents of this directory will be copied and made
available to these people. Other parts of your personal filespace
(such as your email records) will not be copied. Other
students will not be able to see the copy; in fact the access
permissions on this directory itself are automatically reset so that
only you, and not other students, can read its contents.
Only plain text files ( JAVA source files and notes that you
make using xemacs in plain ASCII) will be copied. Compiled JAVA classes and files that appear to be mail or Microsoft Word format
are not copied. Nor are directory trees called private,
not_mine or RCS, and some other types of file 1
are also excluded.
When files are copied, a "receipt" for them is recorded in the file
COPY_LOG. You cannot delete or modify this file yourself.
Modification histories of the files are also maintained (in the place
where the copies are made) using the Unix RCS tool. It may be
possible to recover old versions of the file from this.
The purpose of this is to ensure that you do the lab work
that is set each week.
To submit your end of term coursework,
you simply leave it in this directory.
The TESTER will be run on the copied files, but
they will be printed out and looked at individually before being
marked.
Files that you put in this directory must be entirely your
own work.
Since some of the things that you put here will be marked as
coursework, contributing to your final grade for this course and
ultimately to your degree, they must be treated in the same way as
exam scripts. The College
examination regulations therefore apply.
If you wish to keep materials in this directory that are relevant to
your study of this course, but are not your own work (for example,
code that you have downloaded from the Web), you must put it in the
subdirectory not_mine. It will then not be treated as
coursework.
10 Assessment of coursework
The assessment for this course consists of 55% for the May exam
and 15% for each of three items of term-time work, two of these being tests.
All lab work is collected automatically from the directory DCS127 in your filespace, every night. All work for this
course should be done in this directory, as a matter of routine. In
particular, you "submit" coursework simply by
leaving it there.
There are two alternative ways of gaining the third 15% of marks:
- On the basis of your weekly coursework, or
- By doing one of the options listed below in the last three weeks of term.
You will be awarded the better of the two marks.
You do not need to tell me which you want to count.
Please note:
- To gain marks for each week's work (approximately 1% of the final
credit per week), it must be done on time, not at the end of term.
- Normally, the deadline is the end of the day of the lab,
but it will be set and changed each week depending on how difficult
the class as a whole has found the exercise.
No extensions of deadlines will be given for colds or single
absences (and you will only cause annoyance by asking for them),
but of course allowance will be made for medical problems
that last several weeks.
The intention behind this method of assessment is to persuade you to
do the work that is set in each week's lab.
- Formally, the assessment will be for the term's work collectively, not for individual items. Assessment will include a
elements for programming style, etc., in which the items that have
been submitted will be examined qualitatively, not quantitatively.
- Do not think that you can miss labs and make up the marks by
submitting coursework at the end of term instead.
Such work will be examined critically and suspiciously
for correctness and evidence of copying.
But lab work is there to help you learn, so it is rather unlikely
that you will be able to do the (more difficult) end of term
coursework well unless you have done the earlier stuff properly.
- The optional topics are intended for the best students.
Mediocre work will get far fewer marks than it did last year.
11 The optional topics
- Hash tables
for creating, accessing and updating dictionaries.
- Finding a path between two nodes in a network
was covered in 2000-1 for the first time.
- External sorting of a list of names (such as the US population)
that fits on disk but not in RAM.
You have to do your own reading on this option
- there will be no lectures.
- Heap sort and priority queues were
an option in 1998-9 and 1999-2000, but became a core part of the
course (with structured lab work and exam question) in 2000-1.
- Minimax and alpha-beta pruning
for playing games such as chess or draughts.
This was an option in 1998-9 and 1999-2000, prompted by the Software
Engineering projects about implementing games, but not very many
students took it, so it was droppped in 2000-1.
- Black box testing
of an alleged sorting method was an option in 1999-2000,
but only one student (Daniel Dor-Chay) took it.
He discovered the "sawtooth" array that makes quicksort quadratic,
but otherwise this option was not a very good idea.
You can get credit for the options listed above twice:
- 15% for assessed lab-work. You have to implement the algorithm, and hand in your code on 6 April, along
with some of the results of testing that the code works.
- Approximately 14% more for an optional essay in the
exam.
Do not write an essay for the coursework. Postpone further
reading and experimentation (beyond what is necessary to get your
program working) until after you have handed in your program.
12 The essay question in the exam
You do not have to choose the same topic for the coursework and the exam,
but it would obviously be more effective in gaining marks to do so.
However, the coursework and exam test different aspects of the topic.
For the exam you are not expected to remember and reproduce 200 lines of
code, but you will be expected to give the code for the core
algorithms. Your essay should also summarise what you have learned about the algorithm from running and modifying your own
program or my compiled model answer, my lectures, Richard Bornat's
lecture notes, Weiss's book, and any other sources you can find.
The essay question was compulsory in May 1999, but in 2000 and 2001
students had to choose four questions out of eight, one of them being
the essay; the 14% figure above is based on this.
Writing an "essay" in an exam is not a piece of English
composition (although I would like you to know how to spell
"pigeonhole", and the difference between "its" and "it's"). At
school you were probably told to write "a beginning, a middle and an
end", but any "introduction" or "conclusion" you write for this
essay is unlikely to (contain material which will) gain you any marks.
An exam essay is like one of those TV game shows where you have to
remember as many items as possible that have gone past on a conveyor
belt ("TV, teddy bear, toaster, ..."). How many facts can you
remember about this topic?
The exam rubric for the essay question [in May 2000] read as
follows:
Write an essay about one of the [above] topics.
Your answer should include a selection of the most important parts of
the code: do not include JAVA declarations unless they illustrate some
aspect of the algorithm. Explain the main points in proving
correctness. State the main facts about complexity.
You are unlikely to gain any marks at all for this question unless you
have implemented the algorithm in the lab sessions and thought
carefully about its behaviour. Remember that this is an exam about
algorithms, not about English composition, and that your answer will
be marked according to the information that it contains about
algorithms: wordy answers to questions like this gain very few marks.
The marking of both the coursework and essay question will be open-ended:
the mark-scheme will give full marks for the basic issues,
but generous bonus marks will be given for anything else you discover by
experimentation with your program.
Unfortunately,
it is not within my power to give total marks above 100% for the course,
or to give any prizes for supererogation.
13 Warning about plagiarism
It is
the College's policy
to punish plagiarism severely.
- This is part of your degree assessment,
and is intended to be individual work.
- Copying from other students' work (with or without their consent)
is plagiarism.
- If you work in pairs or larger groups,
each member of the group must submit identical work,
which must state the names of all of the members of the group
and the agreed proportions of the work that they contributed.
If you have had the help of others to do the work you submit
then you will of course get fewer marks, to compensate.
- To copy code out from Richard Bornat's notes, Weiss's book or web pages,
or from any other sources without identifying the source precisely
is also plagiarism,
- If you use a book, you must give the
author, title, publisher and page numbers.
- If you use the web you must give the full URL of the
relevant page, not just the name of the site.
- Changing what you copy from books, web pages or other students
counts as deliberate attempted deception.
It is also insulting to expect someone (a lecturer or a colleague)
to give individual attention to something that does not in fact
have any original content.
Giving precise references is a basic professional habit, and is
useful. For example if you've filched someone else's code
like this, and adapted it to your own needs, and then it doesn't work,
you (or your colleagues) can go directly back to the original source.
You don't have to cite your sources in the exam (in my course).
Footnotes:
1Emacs backup files (file ∼ and #file#);
"dot" files (.login); your "receipt" COPY_LOG;
files whose names contain spaces, quotes or other strange characters;
symbolic links; PostScript; PDF; executables;
images (picture.gif); archives (tester.jar,
folder.zip, folder.tar.gz).This is
www.PaulTaylor.EU/algorithms/info.html
and it was derived from algorithms/info.tex
which was last modified on 31 December 2007.