Multivariate, Text, Domain-Theory. Classification, Clustering.
LYMPHOMA
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Alizadeh, Michael B. Eric Davis, Chi Ma, Izidore S. Lossos, Andreas Rosenwald, Jennifer C. Boldrick, Hajeer Sabet, Truc Tran, Xin Yu, John I. Powell, Liming Yang, Gerald E. Marti, Troy Moore, James Hudson Jr, Lisheng Lu, David B. Lewis, Robert Tibshirani, Gavin Sherlock, Wing C. Chan, Timothy C. Greiner, Dennis D. Weisenburger, James O. Armitage, Roger Warnke, Ronald Levy, Wyndham Wilson, Michael R. Grever, John C. Byrd, David Botstein, Patrick O. Brown & Louis M. Staudt. VOL 403, Nº 3, pp. 503-511, February 2000.

LEUKEMIA
Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Gaasenbeek, J. Caligiuri, C. Bloomfield, E., VOL 286, pp. 531-537, 15 October 1999.

GLOBAL CANCER MAP
Multiclass cancer diagnosis using tumor gene expression signatures. Ramaswamy, P. Mukherjee, C.-H. Latulippe, J.P. Lander and T.R., VOL 98, nº 26, pp. 4, December 18, 2001.

DISCOVERY CHALLENGE ECML 2004
The dataset was prepared by downloading and processing various information from the SAGEmap website as of December 2002.

EMBRYONAL TUMOURS OF THE CENTRAL NERVOUS SYSTEM
Prediction of Central Nervous System Embryonal Tumour Outcome based on Gene Expression. Pomeroy, Pablo Tamayo, Michelle Gaasenbeek, Lisa M. Sturla, Michael Angelo, Margaret E. McLaughlin, John Y. Kim, Liliana C. Goumnerova, Peter M. Black, Ching Lau, Jeffrey C. Allen, David Zagzag, James M. Olson, Tom Curran, Cynthia Wetmore, Jaclyn A. Biegel, Tomaso Poggio, Shayan Mukherjee, Ryan Rifkin, Andrea Califano, Gustavo Stolovitzky, David N. Louis, Jill P. Mesirov, Eric S. Lander & Todd R. Golub. VOL 415, pp. 436-442, 24 January 2002.
COLON CANCER
Broad patterns of gene expression revealed by clustering of tumor and normal colon tissues probed by oligonucleotide arrays. Notterman, K. Levine., VOL 96, Issue 12, pp. 6745-6750, 8 June 1999.

YEAST
Systematic determination of genetic network architecture. Tavazoie, J.D. Campbell, R.J., 1999 Jul;22(3):281-5. Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church ( ). From:

STATE FAILURE
Advanced Data- and Knowledge-Driven Methods for State Failure Risk Assessment. State Failure Task Force-III. (ARFF) 2.5Mb. 10.4Mb.
ARFF datasets

The ELF reader for ARFF files supports only categorical features, and every value that occurs in the data must be declared in the attribute section. For example, if the value '?' occurs in the data section but is not declared for that attribute, reading the data fails.
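Since an undeclared value makes the read fail, it can help to scan a file before handing it to the reader. The following is a minimal sketch in Python and is not part of ELF; the function name `find_undeclared_values` is hypothetical. It assumes a simple dense ARFF layout: one `@attribute` declaration per line, attribute names without spaces, and data values that contain no commas. Attributes that are not declared as a `{...}` value list are skipped rather than checked.

```python
import re


def find_undeclared_values(arff_text):
    """Return (line_no, attribute, value) for each data entry that is
    not listed in the attribute section (hypothetical helper)."""
    attributes = []  # (name, set of declared values, or None if not nominal)
    problems = []
    in_data = False
    for line_no, raw in enumerate(arff_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        if not in_data:
            lower = line.lower()
            if lower.startswith("@attribute"):
                # Assumption: attribute names contain no spaces.
                m = re.match(r"@attribute\s+(\S+)\s+(.*)", line, re.IGNORECASE)
                if m is None:
                    continue
                name, spec = m.group(1), m.group(2).strip()
                if spec.startswith("{") and spec.endswith("}"):
                    allowed = {v.strip().strip("'\"") for v in spec[1:-1].split(",")}
                else:
                    allowed = None  # numeric/string/date: not checked here
                attributes.append((name, allowed))
            elif lower.startswith("@data"):
                in_data = True
            continue
        # Assumption: dense rows, no commas inside quoted values.
        values = [v.strip().strip("'\"") for v in line.split(",")]
        for (name, allowed), value in zip(attributes, values):
            if allowed is not None and value not in allowed:
                problems.append((line_no, name, value))
    return problems


if __name__ == "__main__":
    sample = (
        "@relation weather\n"
        "@attribute outlook {sunny, overcast, rainy}\n"
        "@attribute play {yes, no}\n"
        "@data\n"
        "sunny,no\n"
        "?,yes\n"
    )
    print(find_undeclared_values(sample))
```

An empty result means every nominal value in the data section is declared. A non-empty result lists the lines to fix, either by adding the value (for example '?') to the attribute declaration or by cleaning the data.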
The file settings.txt contains the dataset names of the train and test sets and the name of the target column. Please note that the test data must also contain target values. For example:

    train=UCI/diabetes.arff
    test=UCI/diabetesTest.arff
    trainTargetColumn='class'

The ARFF reader works for the following datasets from (first jar file from page). We have a preconfigured directory with ARFF files:
anneal.arff, balance-scale.arff, credit-g.arff, diabetes.arff, glass.arff, heart-statlog.arff, ionosphere.arff, iris.arff, kr-vs-kp.arff, letter.arff, lymph.arff, segment.arff, sonar.arff,
splice.arff, vehicle.arff, vowel.arff, waveform-5000.arff, zoo.arff.
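
The settings.txt format described above can be checked before a run with a few lines of Python. This is a sketch under stated assumptions, not ELF's own parser: it assumes one `key=value` pair per line, and it strips the quotes around the target column name, which may differ from how ELF treats them. The helper names `parse_settings` and `missing_keys` are hypothetical.

```python
def parse_settings(text):
    """Parse key=value lines into a dict (hypothetical helper, not ELF code)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # ignore blank or malformed lines
        key, value = line.split("=", 1)
        # Assumption: surrounding quotes are not part of the value.
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def missing_keys(settings, required=("train", "test", "trainTargetColumn")):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


if __name__ == "__main__":
    example = (
        "train=UCI/diabetes.arff\n"
        "test=UCI/diabetesTest.arff\n"
        "trainTargetColumn='class'\n"
    )
    parsed = parse_settings(example)
    print(parsed)
    print(missing_keys(parsed))
```

Note that this only checks that the three keys are present. Whether the test file actually contains the target column still has to be verified against the ARFF file itself.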