org.galagosearch.core.parse
Class DocumentDataExtractor

java.lang.Object
  extended by org.galagosearch.tupleflow.StandardStep<Document,org.galagosearch.core.types.DocumentData>
      extended by org.galagosearch.core.parse.DocumentDataExtractor
All Implemented Interfaces:
org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<org.galagosearch.core.types.DocumentData>, org.galagosearch.tupleflow.Step

@InputClass(className="org.galagosearch.core.parse.Document")
@OutputClass(className="org.galagosearch.core.types.DocumentData")
public class DocumentDataExtractor
extends org.galagosearch.tupleflow.StandardStep<Document,org.galagosearch.core.types.DocumentData>

Copies a few pieces of metadata about a document (identifier, URL, length) from a Document object into a DocumentData tuple.
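The copy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the class's actual source: the nested `Document`, `DocumentData`, and `Processor` types are hypothetical stand-ins for the Galago classes, and their field names (`identifier`, `metadata`, `terms`, `url`, `textLength`), the lookup of the URL under a `"url"` metadata key, and the use of the term count as the document length are all assumed for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DocumentDataExtractorSketch {

    // Hypothetical stand-in for org.galagosearch.core.parse.Document.
    static class Document {
        String identifier;
        Map<String, String> metadata = new HashMap<>();
        List<String> terms = new ArrayList<>();
    }

    // Hypothetical stand-in for org.galagosearch.core.types.DocumentData.
    static class DocumentData {
        String identifier;
        String url;
        int textLength;
    }

    // Hypothetical stand-in for the downstream tupleflow Processor.
    interface Processor<T> {
        void process(T object) throws IOException;
    }

    // Copies identifier, URL, and length into a new tuple.
    // Assumption: a document without a "url" entry gets an empty URL.
    static DocumentData extract(Document document) {
        DocumentData data = new DocumentData();
        data.identifier = document.identifier;
        String url = document.metadata.get("url");
        data.url = (url != null) ? url : "";
        data.textLength = document.terms.size();
        return data;
    }

    // Mirrors the shape of a step: build the tuple, hand it to the next stage.
    static void process(Document document, Processor<DocumentData> next)
            throws IOException {
        next.process(extract(document));
    }

    public static void main(String[] args) throws IOException {
        Document doc = new Document();
        doc.identifier = "doc-1";
        doc.metadata.put("url", "http://example.com/a");
        doc.terms.add("hello");
        doc.terms.add("world");

        // The downstream stage here simply prints the extracted tuple.
        process(doc, data -> System.out.println(
                data.identifier + " " + data.url + " " + data.textLength));
    }
}
```

In a real pipeline the downstream stage would be whatever was supplied through the inherited `setProcessor` method; the lambda in `main` only makes the sketch observable.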

Author:
trevor

Field Summary
 
Fields inherited from class org.galagosearch.tupleflow.StandardStep
processor
 
Constructor Summary
DocumentDataExtractor()
           
 
Method Summary
 void process(Document document)
           
 
Methods inherited from class org.galagosearch.tupleflow.StandardStep
close, setProcessor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentDataExtractor

public DocumentDataExtractor()

Method Detail

process

public void process(Document document)
             throws java.io.IOException
Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
Specified by:
process in class org.galagosearch.tupleflow.StandardStep<Document,org.galagosearch.core.types.DocumentData>
Throws:
java.io.IOException


Copyright © 2009. All Rights Reserved.