java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.uima.lucas.indexer.analysis.AnnotationTokenStream
public class AnnotationTokenStream
AnnotationTokenStream is a TokenStream that extracts tokens from the feature values of annotations of a given type in a JCas object. Each token takes its start and end offsets from the annotation it was created from. This class supports only the following UIMA JCas types of features:
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
Constructor Summary | |
---|---|
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName) | Creates a TokenStream which extracts all coveredText feature values of annotations of a given type from a JCas object. |
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, List<String> featureNames, Map<String,Format> featureFormats) | Creates a TokenStream which extracts all feature values of a given feature name list from annotations with a given type from a given JCas object. |
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, List<String> featureNames, String delimiter, Map<String,Format> featureFormats) | Creates a TokenStream which extracts all feature values of a given feature name list from annotations with a given type from a given JCas object. |
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featureName, Format featureFormat) | Creates a TokenStream which extracts all feature values of a given feature name from annotations with a given type from a given JCas object. |
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featurePath, List<String> featureNames, Map<String,Format> featureFormats) | Creates a TokenStream which extracts all feature values of a given feature name list from annotations with a given type from a given JCas object. |
AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featurePath, List<String> featureNames, String delimiter, Map<String,Format> featureFormats) | Creates a TokenStream which extracts all feature values of a given feature name list from annotations with a given type from a given JCas object. |
Method Summary | |
---|---|
protected Iterator<org.apache.uima.cas.FeatureStructure> | createFeatureStructureIterator(org.apache.uima.jcas.tcas.Annotation annotation, String featurePath) |
protected Iterator<String> | createFeatureValueIterator(org.apache.uima.cas.FeatureStructure srcFeatureStructure, Collection<String> featureNames) |
org.apache.uima.cas.Type | getAnnotationType() |
String | getDelimiter() |
Map<String,Format> | getFeatureFormats() |
List<String> | getFeatureNames() |
String | getFeaturePath() |
org.apache.uima.jcas.JCas | getJCas() |
String | getValueForFeature(org.apache.uima.cas.FeatureStructure featureStructure, org.apache.uima.cas.Feature feature, Format format) |
protected void | initializeIterators() |
org.apache.lucene.analysis.Token | next() |
org.apache.lucene.analysis.Token | next(org.apache.lucene.analysis.Token token) |
void | reset() |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
close, end, getOnlyUseNewAPI, incrementToken, setOnlyUseNewAPI |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
Throws:
org.apache.uima.cas.CASException
InvalidTokenSourceException
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featureName, Format featureFormat) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
featureName - the name of the feature from which the token text is built
featureFormat - optional format object to convert feature values to strings
Throws:
InvalidTokenSourceException
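As an illustration of the optional featureFormat parameter, the following minimal sketch shows how a feature value might be turned into token text. It assumes that Format refers to java.text.Format (this page does not qualify the package). The class FeatureFormatSketch and the method toTokenText are hypothetical and are not part of AnnotationTokenStream.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.Format;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the Lucas API.
class FeatureFormatSketch {

    // Converts one feature value to token text.
    // Assumption: without a format, the value's toString() result is used.
    static String toTokenText(Object featureValue, Format format) {
        if (featureValue == null) {
            return null;
        }
        return format == null ? featureValue.toString() : format.format(featureValue);
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the digits independent of the default locale.
        Format twoDigits = new DecimalFormat("00", DecimalFormatSymbols.getInstance(Locale.ROOT));
        System.out.println(toTokenText(7, twoDigits)); // numeric feature value with a format
        System.out.println(toTokenText("Berlin", null)); // string feature value, no format
    }
}
```

In real use the format object would be passed to the constructor as featureFormat, and the stream would apply it internally.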
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, List<String> featureNames, String delimiter, Map<String,Format> featureFormats) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
featureNames - the names of the features from which the token text is built
delimiter - a delimiter for concatenating the different feature values of an annotation object. If null, a white space will be used.
featureFormats - optional map of format objects to convert feature values to strings; the key must be the feature name
Throws:
InvalidTokenSourceException
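The delimiter and featureFormats parameters described above can be sketched as follows. This is only an approximation of the documented behaviour, not the actual implementation: the class DelimiterJoinSketch, the method joinFeatureValues, and the plain Map standing in for an annotation's feature values are all hypothetical, Format is assumed to be java.text.Format, and skipping missing values is an assumption.

```java
import java.text.Format;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of the Lucas API.
class DelimiterJoinSketch {

    // Concatenates the values of the named features into one token text.
    // A null delimiter falls back to a single white space, as documented above.
    static String joinFeatureValues(List<String> featureNames,
                                    Map<String, Object> featureValues,
                                    String delimiter,
                                    Map<String, Format> featureFormats) {
        StringJoiner joiner = new StringJoiner(delimiter == null ? " " : delimiter);
        for (String name : featureNames) {
            Object value = featureValues.get(name);
            if (value == null) {
                continue; // assumption: features without a value are skipped
            }
            // The format map is keyed by feature name; entries are optional.
            Format format = featureFormats == null ? null : featureFormats.get(name);
            joiner.add(format == null ? value.toString() : format.format(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<>();
        values.put("firstName", "Ada");
        values.put("lastName", "Lovelace");
        List<String> names = Arrays.asList("firstName", "lastName");
        System.out.println(joinFeatureValues(names, values, null, null));
        System.out.println(joinFeatureValues(names, values, "_", null));
    }
}
```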
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, List<String> featureNames, Map<String,Format> featureFormats) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
featureNames - the names of the features from which the token text is built
featureFormats - optional map of format objects to convert feature values to strings; the key must be the feature name
Throws:
InvalidTokenSourceException
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featurePath, List<String> featureNames, Map<String,Format> featureFormats) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
featurePath - the path to the feature structures whose features should be used for tokens. Path entries should be separated by ".". Example: "affiliation.address.country"
featureNames - the names of the features from which the token text is built
featureFormats - optional map of format objects to convert feature values to strings; the key must be the feature name
Throws:
InvalidTokenSourceException
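To make the featurePath syntax concrete, here is a minimal sketch of how a dot-separated path such as "affiliation.address.country" could be split and followed. Nested Map objects stand in for UIMA feature structures purely for illustration. The class FeaturePathSketch and its methods are hypothetical, and the real class may resolve paths differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Lucas API.
class FeaturePathSketch {

    // Splits a feature path into its entries; "." is the separator.
    static List<String> splitFeaturePath(String featurePath) {
        if (featurePath == null || featurePath.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(featurePath.split("\\."));
    }

    // Follows the path through nested maps (stand-ins for feature structures).
    // Returns null when an entry along the path is missing.
    static Object resolve(Map<String, Object> root, String featurePath) {
        Object current = root;
        for (String entry : splitFeaturePath(featurePath)) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(entry);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new HashMap<>();
        address.put("country", "DE");
        Map<String, Object> affiliation = new HashMap<>();
        affiliation.put("address", address);
        Map<String, Object> annotation = new HashMap<>();
        annotation.put("affiliation", affiliation);

        System.out.println(splitFeaturePath("affiliation.address.country"));
        System.out.println(resolve(annotation, "affiliation.address.country"));
    }
}
```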
public AnnotationTokenStream(org.apache.uima.jcas.JCas jCas, String sofaName, String typeName, String featurePath, List<String> featureNames, String delimiter, Map<String,Format> featureFormats) throws InvalidTokenSourceException
Parameters:
jCas - the JCas object
sofaName - the name of the subject of analysis (sofa)
typeName - the type of the annotation
featurePath - the path to the feature structures whose features should be used for tokens. Path entries should be separated by ".". Example: "affiliation.address.country"
featureNames - the names of the features from which the token text is built
delimiter - a delimiter for concatenating the different feature values of an annotation object. If null, a white space will be used.
featureFormats - optional map of format objects to convert feature values to strings; the key must be the feature name
Throws:
InvalidTokenSourceException
Method Detail |
---|
public org.apache.lucene.analysis.Token next(org.apache.lucene.analysis.Token token) throws IOException
Overrides:
next in class org.apache.lucene.analysis.TokenStream
Throws:
IOException
public org.apache.lucene.analysis.Token next() throws IOException
Overrides:
next in class org.apache.lucene.analysis.TokenStream
Throws:
IOException
protected void initializeIterators()
protected Iterator<org.apache.uima.cas.FeatureStructure> createFeatureStructureIterator(org.apache.uima.jcas.tcas.Annotation annotation, String featurePath)
protected Iterator<String> createFeatureValueIterator(org.apache.uima.cas.FeatureStructure srcFeatureStructure, Collection<String> featureNames)
public String getValueForFeature(org.apache.uima.cas.FeatureStructure featureStructure, org.apache.uima.cas.Feature feature, Format format)
public void reset()
Overrides:
reset in class org.apache.lucene.analysis.TokenStream
public Map<String,Format> getFeatureFormats()
public org.apache.uima.jcas.JCas getJCas()
public String getFeaturePath()
public List<String> getFeatureNames()
public String getDelimiter()
public org.apache.uima.cas.Type getAnnotationType()