Text extraction

Creating a database for text classification

The Internet seems the best source because we are interested in collecting different types of data. The restriction for my test is that the positive data should contain only one-liners (short jokes). For the classification to work well, the negative data has to have the same structure (short sentences); otherwise the classifier could learn to separate the two classes by length rather than by content.
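
As a rough illustration of this constraint, here is a minimal Python sketch that keeps only short, single-line texts for both classes and labels them. The function names (`is_one_liner`, `build_dataset`), the 30-word cutoff, and the sample sentences are all assumptions made for the example, not part of the original setup.

```python
# Hypothetical cutoff: what counts as "short" is an assumption of this sketch.
MAX_WORDS = 30


def is_one_liner(text, max_words=MAX_WORDS):
    """Return True if text is a single line with at most max_words words."""
    text = text.strip()
    if not text or "\n" in text:
        return False
    return len(text.split()) <= max_words


def build_dataset(jokes, non_jokes, max_words=MAX_WORDS):
    """Label short jokes as 1 and short non-joke sentences as 0.

    Both classes pass through the same filter so that they share
    the same structure (short, single-line texts).
    """
    positives = [(t.strip(), 1) for t in jokes if is_one_liner(t, max_words)]
    negatives = [(t.strip(), 0) for t in non_jokes if is_one_liner(t, max_words)]
    return positives + negatives


# Made-up sample data, for illustration only.
jokes = [
    "I used to be a banker, but I lost interest.",
    "A long joke\nthat spans two lines.",
]
non_jokes = [
    "The meeting starts at nine.",
    " ".join(["word"] * 50),
]

dataset = build_dataset(jokes, non_jokes)
for text, label in dataset:
    print(label, text)
```

Applying the same filter to both classes is the point of the sketch: the multi-line joke and the 50-word sentence are dropped, so what remains differs in content rather than in length.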
