Literature & “ebook” Collections


Here we list sites that either collect printed works or collect literary works, e.g. poems, novels, plays, etc., but which are not focussed on a specific type of printed work, e.g. scientific journals or archaeological publications.

An e-book can be any type of digital publication, and there are specific collections out there providing free electronic versions of printed books in different formats, e.g. from .html, .doc and .epub to .azw3.

Project Gutenberg is the grandfather of free literary content on the Web. All 46,000 “ebooks” are in plain ASCII and in the public domain. The site points to partners who collectively make available more than 100,000 free ebooks. As an example, during one of my visits to the Website they had just digitised and encoded “Writing and Drawing Made Easy”, a novel entitled “Kathleen’s Diamonds”, and the non-fiction books “Stanley in Africa” and “The Evolutionist at Large”.

The Alex Catalogue provides the full text of hundreds of old books, listed by title or author. The focus is on American and English literature and Western philosophy as part of a liberal arts education. Every book I checked was a download from the Project Gutenberg collection, but you can just read the lists and click through to the book of your choice. Some examples I looked at included “Aeroplanes and Dirigibles of War” and “La Sainte Courtisane” by Oscar Wilde.

Project Bartleby from Columbia University reproduces, free of charge, classics of literature, non-fiction, and reference. During one visit to their Website the featured “authors” included Emily Dickinson, Wordsworth, T. S. Eliot, Sigmund Freud, US Presidents’ Inaugural Addresses, and more than 1,200 engravings from Gray’s Anatomy (mind you, when I returned some 6 months later the same authors were still featured). More generally, “Reference” includes collections of quotations, sayings, proverbs, and similes. “Verse” covers everything from Elizabethan verse to the poets of Transcendentalism. “Fiction” includes the Harvard Classics, the Fables of Aesop, and the tales of Hans Christian Andersen. “Non-fiction” ranges from the literature of social protest, through the works of Francis Bacon, to the sayings of Confucius.

The Online Books Page consists of an index to more than 2 million freely available online books and pointers to significant directories and archives. During one of my visits there was an interesting review about censorship and banned books, e.g. Ulysses, Fanny Hill and even the Canterbury Tales. It also has a collection of links to non-English language sites (e.g. Austrian Literature Online) and to specialty archives (e.g. Antique Pattern Library).

bibliomania has around 2,000 classic texts, including Erewhon, The Scarlet Pimpernel, Mansfield Park, The Mayor of Casterbridge, The Phantom of the Opera, and Alice’s Adventures in Wonderland. There is also a reference section and a list of study guides.

LibriVox is a site offering free public domain audiobooks. They claim to be home to more than 7,000 works read by more than 6,000 different volunteers, including more than 1,000 non-English works. Naturally some are boring and some highly interesting, and much depends upon the quality of the “reader”. If you want to check this out, try anything read by Andy Minter.

The Thrall public library in New York has a great section on “Literature” (under Research) with lists covering almost every imaginable view of the library, e.g. list of authors, classics, American literature, ebooks, history, poetry, quotations, etc. Back at the library home-page you can also find “research” lists on a multitude of other topics, from the arts to the weather (some of the lists are not that rich, but they have the merit of existing).

Literary Resources on the Net is a directory of English and American resources on topics such as classical & Biblical literature, the Renaissance, theatre & drama, etc. It does not appear to have been updated since 2006, but it still looks quite valid and all the links I tried still worked.

English Literature on the Web, based in Japan, is a massive collection of links to English literature resources. There are links to many topics including medieval, Renaissance, and 20th C literature. It was last updated in Oct. 2013, but all the links I tested still appear to work.

The MIT Libraries have quite a collection of links to scholarly literature and pointers to the full texts of novels, plays, etc. There is a substantial list of Reference Works, a good list of pointers to literary periods and genres, a list of pointers by author, and a list of free e-journals. Just as an example, there is a pointer to a site on Romanticism and Victorianism on the Net, which links to such articles as “A Portrait of the Monster as Criminal, or the Criminal as Outcast: Opposing Ætiologies of Crime in Mary Shelley’s Frankenstein”.

The University of Oxford Text Archive would appear to have more than 5,700 books in its collection, freely available in different formats (XML, HTML, ePub, mobi (Kindle), and plain text). During my visit I spotted Treasure Island, a series of plays by Shakespeare, an account of aether as an “extraordinary medicinal fluid” dating from 1761, and some practical hints on opium dating from 1790. Although it is a rather technical issue, it is worth noting that the archive hosts many documents “marked up” according to the guidelines of the Text Encoding Initiative, and it also manages the distribution of the British National Corpus.

The British National Corpus is a collection of 100 million words of written and spoken 20th C British English (90% written and 10% spoken). These have been extracted from newspapers, periodicals, journals, books, and a wide range of unpublished texts, including even informal conversations, radio shows and phone-ins. The corpus has been encoded to identify different structural properties of the texts and the parts of speech. For example, there is a simple search function that found 73 occurrences of the word “orchestration”, including “the orchestration of private power”, and no occurrences of the word “touchless” as relating to the operation of technology by gesture (this is almost certainly because the corpus was established in 1991-1994). If you’re not quite sure what to do with a corpus check this out.

The 2.5 billion word Oxford English Corpus is more inclusive of 21st C English words from the entire English-speaking world, and is the basis for the Oxford Dictionaries. Many languages and language varieties have their own text corpora, for example the 450 million word Corpus of Contemporary American English, the 1.9 billion word 20-country corpus of global Web-based English, the 100’s of billions of words in the Google Books corpus, the 15 million spoken word Open American National Corpus, the 100 million words of the Balanced Corpus of Contemporary Written Japanese, the 300 million words of the Russian National Corpus, and the 6 billion word German Corpus. We must also not forget that corpora are not always designed to be recent, for example you can find a 3.3 million word Base de Français Médiéval, and the Corpus Vasorum Antiquorum has collected specialist corpora from all over the world concerning ancient vases.

France appears to have a different view concerning corpora. I could not find a “central” French corpus, but the Centre National de Ressources Textuelles et Lexicales lists a large collection of different corpora, going from a portal providing access to different corpora, through Frantext with its 500 literary works, to specific corpora such as the “journalistique” one of L’Est Républicain. And you can always check out specific words and meanings using Le Trésor de la Langue Française Informatisé.

The reality is that today many corpora are not freely available (or only a sub-set is available), and many corpora have very specific limitations. For example, the mix of “modern” transcripts of television, film or radio as compared to “classical” fiction, technical and news transcripts can vary, and the spoken parts are often based upon telephone hot-line or switchboard dialogues and not face-to-face conversations. If you are still interested in corpora have a look at the Sketch Engine, which can show how a word behaves grammatically, as derived from 52 different language corpora each having at least 1 million words.

Voice of the Shuttle has a large list of links concerning literature (in English), including Bohemian Ink (an online review of the history and future of experimental literature and poetry), Luminarium (an anthology of English literature from the medieval to the Restoration), the US-based National Association of Scholars, Let’s Talk (for storytellers), and a whole set of additional lists on things ranging from Legal Studies to Dance. Looking at one section in more detail, you have a whole page of links on Postindustrial Business Theory, and these point to sites such as the Economic History Association. However, I did find that some of the lists were rather superficial, and that some substantial link quality checking and maintenance was needed.

GrayNet is all about so-called gray literature, i.e. things that are “published” outside the world of commercial publishing. Examples would include research reports, doctoral dissertations, some conference papers, etc. OpenGrey links to more than 700,000 bibliographic references across everything from aeronautics to space technology, and including the humanities, psychology and social sciences.

If you are a fan of Aesop’s Fables then the site “Aesopica” is the one for you. You have the English, Latin and Greek versions, illustrations from all periods, etc.; in fact everything you will ever need to know. I’m not sure if this site is regularly maintained, but it is still there and it all works. And if this has whetted your appetite for Latin then check out the blog Bestiaria Latina.

The Bibliotheca Augustana is an e-book library of texts (Latina, Graeca, Germanica, Anglica, Gallica, Italica, Hispanica, Polonica, Russica, Iiddica, Lusitana, and Bohemica), along with a virtual museum of architecture, sculpture, and painting, a collection of music, and even a film collection. I personally found this site very poorly designed, e.g. bad layout, colours and fonts, but it is certainly well maintained and very rich in content.

The Electronic Literature Collection is just that: a collection of electronic books from Oct. 2006 through to Feb. 2011. It looks more like a pointer to resources than a resource itself, but it has the merit of existing and of highlighting modern “electronic” literature.

Wikiquote is a free quote compendium based upon more than 23,000 articles. The French Wikiquote has more than 33,000 citations from more than 7,000 articles. These are just two of the 46 languages used, including Lëtzebuergesch.

