SphKmeans

How to run

To build, run make. Only tested on Linux.

First, run python parser.py to preprocess the dataset. This step takes roughly 15 minutes. I have already run it, so the preprocessed files should be in the directory. The algorithm can then be run on one of those input files.

Preprocessing generates char3, char5, and char7 representations, along with a bag-of-words representation, of the reuters21578 newspaper article dataset. After preprocessing, you can run spherical k-means.
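As a rough illustration of what these representations might look like, here is a minimal Python sketch. It assumes that charN means counts of overlapping character n-grams of length N and that bag-of-words means counts of lowercase alphabetic tokens. The function names char_ngrams and bag_of_words are hypothetical, and the actual tokenization in parser.py may differ.

```python
import re
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams of length n (assumed meaning of charN)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def bag_of_words(text):
    """Count lowercase alphabetic tokens (an assumed, simplified tokenization)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


if __name__ == "__main__":
    sample = "Oil prices rose; oil exports fell."
    print(char_ngrams(sample, 3).most_common(3))
    print(bag_of_words(sample))
```

Each document then becomes a sparse vector of counts, which is the kind of input a clustering step can work on.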

To run: ./sphkmeans <input-file> <class-file> <K> <max-iterations> <output-file>

This takes about a minute to run.

Example usage: ./sphkmeans bag_of_words.csv reuters21578.class 20 20 output.out

Brief Description

Spherical k-means is a clustering algorithm, i.e. an algorithm that tries to group data into K categories. Each data point is a vector in N dimensions. The algorithm first picks K random vectors as centroids, each defining its own cluster. It then assigns every data vector to the cluster of its closest centroid, where closeness is measured using cosine similarity. Next, new centroids are found by averaging all the vectors in each cluster. The assignment and averaging steps repeat until the clusters stop changing or max-iterations is reached.
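The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the C implementation in sphkmeans.c. It assumes dense vectors given as lists of floats, normalizes both the inputs and the averaged centroids to unit length (so that cosine similarity reduces to a dot product), and keeps the previous centroid when a cluster becomes empty. The function name spherical_kmeans and its seed parameter are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(vectors, k, max_iterations, seed=0):
    """Return (labels, centroids) for a list of equal-length vectors."""
    rng = random.Random(seed)
    data = [normalize(v) for v in vectors]
    # Step 1: pick K distinct input vectors as the initial centroids.
    centroids = [list(data[i]) for i in rng.sample(range(len(data)), k)]
    labels = [-1] * len(data)

    for _ in range(max_iterations):
        # Step 2: assign each vector to the centroid with the highest
        # cosine similarity.
        new_labels = [
            max(range(k), key=lambda c: dot(v, centroids[c])) for v in data
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the normalized mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(data, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels, centroids


if __name__ == "__main__":
    points = [[1, 0.1], [1, 0.2], [0.9, 0.0], [0.1, 1], [0.2, 1], [0.0, 0.8]]
    labels, _ = spherical_kmeans(points, k=2, max_iterations=20)
    print(labels)
```

Because the initial centroids are chosen at random, the cluster numbering (and, on harder data, the clustering itself) can differ between runs unless the seed is fixed.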
