org.knallgrau.utils.textcat
Class FingerPrint

java.lang.Object
  extended by java.util.Dictionary<K,V>
      extended by java.util.Hashtable<java.lang.String,java.lang.Integer>
          extended by org.knallgrau.utils.textcat.FingerPrint
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.util.Map<java.lang.String,java.lang.Integer>

public class FingerPrint
extends java.util.Hashtable<java.lang.String,java.lang.Integer>

See Also:
Serialized Form

Constructor Summary
FingerPrint()
           
FingerPrint(java.io.InputStream is)
          Creates a FingerPrint by reading it from the given InputStream.
FingerPrint(java.lang.String file)
          Creates a FingerPrint by reading the FingerPrint file at the given path.
 
Method Summary
 java.util.Map<java.lang.String,java.lang.Integer> categorize(java.util.Collection<FingerPrint> categories)
          Categorizes this FingerPrint by computing its distance to each FingerPrint in the given Collection. The category of the FingerPrint with the lowest distance is assigned to this FingerPrint.
 void create(java.io.File file)
          Creates a FingerPrint by analysing the content of the given file.
 void create(java.lang.String text)
          Fills the FingerPrint with all the NGrams found in the given text and their number of occurrences.
 java.lang.String getCategory()
          Returns the category of the FingerPrint, or "unknown" if the FingerPrint has not been categorized yet.
 java.util.Map<java.lang.String,java.lang.Integer> getCategoryDistances()
           
 int getPosition(java.lang.String key)
          Gets the position of the given NGram in the FingerPrint. The NGrams are in descending order of their number of occurrences in the text that was used to create the FingerPrint.
 void save()
          Saves the FingerPrint to a file named .lm in the execution path.
 java.lang.String toString()
          Returns the FingerPrint as a String in the FingerPrint file format.
 
Methods inherited from class java.util.Hashtable
clear, clone, contains, containsKey, containsValue, elements, entrySet, equals, get, hashCode, isEmpty, keys, keySet, put, putAll, remove, size, values
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

FingerPrint

public FingerPrint()

FingerPrint

public FingerPrint(java.lang.String file)
Creates a FingerPrint by reading the FingerPrint file at the given path.

Parameters:
file - path to the FingerPrint file

FingerPrint

public FingerPrint(java.io.InputStream is)
Creates a FingerPrint by reading it from the given InputStream.

Parameters:
is - InputStream for reading the FingerPrint
Method Detail

create

public void create(java.io.File file)
Creates a FingerPrint by analysing the content of the given file.

Parameters:
file - file to be analysed

create

public void create(java.lang.String text)
Fills the FingerPrint with all the NGrams found in the given text and their number of occurrences.

Parameters:
text - text to be analysed
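
To make the idea of counting NGrams concrete, here is a minimal, self-contained sketch. It is not this class's implementation: the names NGramCountSketch and countNGrams and the single fixed NGram length are assumptions made for illustration. The real method may tokenize the text, pad words, or combine several NGram lengths.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of org.knallgrau.utils.textcat.
final class NGramCountSketch {

    // Counts every character n-gram of length n in the text.
    // Assumption: a plain sliding window with no tokenizing or padding.
    static Map<String, Integer> countNGrams(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            // merge() inserts 1 for a new n-gram, otherwise adds 1 to the old count
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "banana" contains the 2-grams ba, an, na, an, na
        System.out.println(countNGrams("banana", 2));
    }
}
```

A String-to-Integer map is used here because FingerPrint itself extends Hashtable<String,Integer>, so the counts would fit that shape.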

categorize

public java.util.Map<java.lang.String,java.lang.Integer> categorize(java.util.Collection<FingerPrint> categories)
Categorizes this FingerPrint by computing its distance to each FingerPrint in the given Collection. The category of the FingerPrint with the lowest distance is assigned to this FingerPrint.

Parameters:
categories - the FingerPrints of the candidate categories to compare against
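
The documentation does not say which distance is computed. A common choice in NGram-based text categorization is the "out-of-place" measure, which sums, for each NGram, how far its rank differs between two profiles. The sketch below assumes that measure. The names CategorizeSketch, distance and closest, the use of ranked lists instead of FingerPrint objects, and the penalty for missing NGrams are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of org.knallgrau.utils.textcat.
final class CategorizeSketch {

    // Out-of-place distance between two ranked n-gram lists (most frequent first).
    // Assumption: an n-gram missing from the category costs the category's size.
    static int distance(List<String> document, List<String> category) {
        int total = 0;
        for (int rank = 0; rank < document.size(); rank++) {
            int other = category.indexOf(document.get(rank));
            total += (other < 0) ? category.size() : Math.abs(rank - other);
        }
        return total;
    }

    // Returns the name of the category with the lowest distance,
    // or "unknown" when no categories are given.
    static String closest(List<String> document, Map<String, List<String>> categories) {
        String best = "unknown";
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, List<String>> entry : categories.entrySet()) {
            int d = distance(document, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, List<String>> categories = new LinkedHashMap<>();
        categories.put("english", Arrays.asList("th", "he", "in"));
        categories.put("german", Arrays.asList("en", "er", "ch"));
        // distances: english = 1 + 1 + 3 = 5, german = 3 + 3 + 1 = 7
        System.out.println(closest(Arrays.asList("he", "th", "er"), categories)); // prints "english"
    }
}
```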

getCategoryDistances

public java.util.Map<java.lang.String,java.lang.Integer> getCategoryDistances()

getPosition

public int getPosition(java.lang.String key)
Gets the position of the given NGram in the FingerPrint. The NGrams are in descending order of their number of occurrences in the text that was used to create the FingerPrint.

Parameters:
key - the NGram
Returns:
the position of the NGram in the FingerPrint
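
The ordering described for getPosition can be sketched as sorting the NGrams by descending count and reporting an NGram's index in that order. This is an illustration only: the names PositionSketch, ranked and position are hypothetical, and the documentation does not state whether positions start at 0 or 1, how ties are broken, or what is returned for an NGram that is not present. The sketch assumes 1-based positions, alphabetical tie-breaking, and -1 for a missing NGram.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of org.knallgrau.utils.textcat.
final class PositionSketch {

    // Orders the n-grams by descending count; ties are alphabetical (assumption).
    static List<String> ranked(Map<String, Integer> counts) {
        List<String> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a)); // higher count first
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return keys;
    }

    // 1-based position of the n-gram in the ranking, or -1 if absent (assumptions).
    static int position(Map<String, Integer> counts, String key) {
        int index = ranked(counts).indexOf(key);
        return index < 0 ? -1 : index + 1;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("an", 2);
        counts.put("na", 2);
        counts.put("ba", 1);
        // ranking is an, na, ba
        System.out.println(position(counts, "na")); // prints 2
    }
}
```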

save

public void save()
Saves the FingerPrint to a file named .lm in the execution path.


getCategory

public java.lang.String getCategory()
Returns the category of the FingerPrint, or "unknown" if the FingerPrint has not been categorized yet.

Returns:
the category of the FingerPrint

toString

public java.lang.String toString()
Returns the FingerPrint as a String in the FingerPrint file format.

Overrides:
toString in class java.util.Hashtable<java.lang.String,java.lang.Integer>