annotation-query-spark3 1.0.7 API - com.elsevier.aq.query

class After extends AnyRef

Provide the ability to find annotations that are after another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are after B: the start offset of an annotation from A must be after the end offset of an annotation in B, and the two annotations must have the same document id. We ultimately return the A annotations that meet these criteria. A distance operator can also be optionally specified. This requires an A annotation (startOffset) to occur n characters or fewer after the B annotation (endOffset). There is also the option of negating the query (think Not After) so that we return only the A annotations that are not after B.
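The After semantics can be sketched on plain Scala collections rather than Spark Datasets. This is only an illustration: `Ann` is a hypothetical, simplified stand-in for AQAnnotation, the parameter names `dist` and `not` are assumptions, and so is the choice of a strict comparison between the offsets.

```scala
object AfterSketch {
  // Hypothetical, simplified stand-in for AQAnnotation (not the library's class).
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long)

  // Keep the A annotations that start after some B annotation ends in the same document.
  // dist caps the gap (a.startOffset - b.endOffset); not = true inverts the selection.
  def after(as: Seq[Ann], bs: Seq[Ann], dist: Long = Long.MaxValue, not: Boolean = false): Seq[Ann] = {
    def isAfter(a: Ann): Boolean = bs.exists { b =>
      a.docId == b.docId &&
      a.startOffset > b.endOffset &&          // strict comparison is an assumption
      a.startOffset - b.endOffset <= dist
    }
    as.filter(a => isAfter(a) != not)
  }
}
```

Because the sketch filters A with an existence check, each A annotation appears at most once in the result, however many B annotations it follows. Before is the mirror image, comparing the A endOffset with the B startOffset.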

class And extends AnyRef

Provide the ability to find annotations that are in the same document.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A and B that are in the same document.

class Before extends AnyRef

Provide the ability to find annotations that are before another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are before B: the end offset of an annotation from A must be before the start offset of an annotation in B, and the two annotations must have the same document id. We ultimately return the A annotations that meet these criteria. A distance operator can also be optionally specified. This requires an A annotation (endOffset) to occur n characters or fewer before the B annotation (startOffset). There is also the option of negating the query (think Not Before) so that we return only the A annotations that are not before B.

class Between extends AnyRef

Provide the ability to find annotations that are before one annotation and after another.

The input is three Datasets of AQAnnotations. We will call them A, B and C. The purpose is to find those annotations in A that are before B and after C: the end offset of an annotation from A must be before the start offset of an annotation in B, the start offset of that A annotation must be after the end offset of an annotation in C, and all of them must have the same document id. We ultimately return the A annotations that meet these criteria. A distance operator can also be optionally specified. This requires the A annotation (endOffset) to occur n characters or fewer before the B annotation (startOffset) and the A annotation (startOffset) to occur n characters or fewer after the C annotation (endOffset). There is also the option of negating the query (think Not Between) so that we return only the A annotations that are not before B nor after C.
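A minimal sketch of the Between condition on plain Scala collections, assuming a hypothetical simplified `Ann` class in place of AQAnnotation and strict offset comparisons. The parameter name `dist` is an assumption, and negation is deliberately left out of the sketch.

```scala
object BetweenSketch {
  // Hypothetical, simplified stand-in for AQAnnotation (not the library's class).
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long)

  // Keep the A annotations that end before some B starts and start after some C ends,
  // all within the same document. dist caps both gaps.
  def between(as: Seq[Ann], bs: Seq[Ann], cs: Seq[Ann], dist: Long = Long.MaxValue): Seq[Ann] = {
    def beforeB(a: Ann): Boolean = bs.exists { b =>
      a.docId == b.docId && a.endOffset < b.startOffset && b.startOffset - a.endOffset <= dist
    }
    def afterC(a: Ann): Boolean = cs.exists { c =>
      a.docId == c.docId && a.startOffset > c.endOffset && a.startOffset - c.endOffset <= dist
    }
    as.filter(a => beforeB(a) && afterC(a))
  }
}
```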

class ContainedIn extends AnyRef

Provide the ability to find annotations that are contained by another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are contained in B: the start/end offsets of an annotation from A must fall within the start/end offsets of an annotation in B, and the two annotations must have the same document id. We ultimately return the contained annotations (A) that meet these criteria. There is also the option of negating the query (think Not ContainedIn) so that we return only the A annotations that are not contained in B.

class ContainedInList extends AnyRef

Provide the ability to find annotations that are contained by another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are contained in B: the start/end offsets of an annotation from A must fall within the start/end offsets of an annotation in B, and the two annotations must have the same document id. We ultimately return a Dataset with two fields, where the first field is an annotation from B and the second field is an array of the annotations from A that are contained in that B annotation.
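The shape of the ContainedInList result can be sketched on plain Scala collections. `Ann` is a hypothetical simplified stand-in for AQAnnotation. Treating the bounds as inclusive, sorting the contained annotations by start offset, and dropping containers that contain nothing are all assumptions of this sketch.

```scala
object ContainedInListSketch {
  // Hypothetical, simplified stand-in for AQAnnotation (not the library's class).
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long)

  // Pair each B annotation with the A annotations that lie inside it.
  def containedInList(as: Seq[Ann], bs: Seq[Ann]): Seq[(Ann, Seq[Ann])] =
    bs.map { b =>
      val inside = as.filter { a =>
        a.docId == b.docId &&
        a.startOffset >= b.startOffset &&   // inclusive bounds are an assumption
        a.endOffset <= b.endOffset
      }
      (b, inside.sortBy(_.startOffset))
    }.filter { case (_, inside) => inside.nonEmpty }
}
```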

class Contains extends AnyRef

Provide the ability to find annotations that contain another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that contain B: the start/end offsets of an annotation from A must enclose the start/end offsets of an annotation in B, and the two annotations must have the same document id. We ultimately return the container annotations (A) that meet these criteria. We also deduplicate the A annotations: a single annotation in A may contain many annotations from B, but it only makes sense to return each container annotation once. There is also the option of negating the query (think Not Contains) so that we return only the A annotations that do not contain B.
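The deduplication step can be illustrated with a join-style sketch on plain Scala collections. `Ann` is a hypothetical simplified stand-in for AQAnnotation, the parameter name `not` is an assumption, and so is the use of inclusive bounds.

```scala
object ContainsSketch {
  // Hypothetical, simplified stand-in for AQAnnotation (not the library's class).
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long)

  def contains(as: Seq[Ann], bs: Seq[Ann], not: Boolean = false): Seq[Ann] = {
    // The pairing yields one row per (A, B) match, so a container holding
    // several B annotations shows up several times until distinct is applied.
    val containers = (for {
      a <- as
      b <- bs
      if a.docId == b.docId && a.startOffset <= b.startOffset && a.endOffset >= b.endOffset
    } yield a).distinct
    if (not) as.filterNot(a => containers.contains(a)) else containers
  }
}
```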

class FilterProperty extends AnyRef

Provide the ability to filter a property field with a specified value in a Dataset of AQAnnotations.

A single value or an array of values can be used for the filter comparison.
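A minimal sketch of filtering on a property with either a single value or several values, using plain Scala collections. `Ann`, its `properties` map, and the `filterProperty` signatures are hypothetical illustrations, not the library's API.

```scala
object FilterPropertySketch {
  // Hypothetical, simplified stand-in for AQAnnotation with a property map.
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long,
                       properties: Map[String, String] = Map.empty)

  // Keep annotations whose property `name` equals any of the given values.
  def filterProperty(anns: Seq[Ann], name: String, values: Seq[String]): Seq[Ann] =
    anns.filter(a => a.properties.get(name).exists(v => values.contains(v)))

  // Single-value convenience overload.
  def filterProperty(anns: Seq[Ann], name: String, value: String): Seq[Ann] =
    filterProperty(anns, name, Seq(value))
}
```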

class FilterSet extends AnyRef

Provide the ability to filter the annotation set field in a Dataset of AQAnnotations.

class FilterType extends AnyRef

Provide the ability to filter the annotation type field in a Dataset of AQAnnotations.

class Following extends AnyRef

Return the following sibling annotations for every annotation in the anchor Dataset[AQAnnotation].

The following sibling annotations can optionally be required to be contained in a container Dataset[AQAnnotation]. The return type of this function is different from other functions. Instead of returning a Dataset[AQAnnotation], this function returns a Dataset[(AQAnnotation,Array[AQAnnotation])].

class MatchProperty extends AnyRef

Provide the ability to find annotations (looking at their property) that are in the same document.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are in the same document as an annotation in B and that also have the same value for the specified property.
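A rough sketch of the MatchProperty condition on plain Scala collections. `Ann` and its `properties` map are hypothetical stand-ins for AQAnnotation. The sketch also assumes that an A annotation lacking the property never matches.

```scala
object MatchPropertySketch {
  // Hypothetical, simplified stand-in for AQAnnotation with a property map.
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long,
                       properties: Map[String, String] = Map.empty)

  // Keep the A annotations that share both the document id and the value of
  // property `name` with at least one B annotation.
  def matchProperty(as: Seq[Ann], bs: Seq[Ann], name: String): Seq[Ann] =
    as.filter { a =>
      a.properties.get(name).exists { value =>
        bs.exists(b => a.docId == b.docId && b.properties.get(name).contains(value))
      }
    }
}
```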

class Or extends AnyRef

Provide the ability to combine (union) annotations.

The input is two Datasets of AQAnnotations. The output is the union of these annotations.

class Preceding extends AnyRef

Return the preceding sibling annotations for every annotation in the anchor Dataset[AQAnnotation].

The preceding sibling annotations can optionally be required to be contained in a container Dataset[AQAnnotation]. The return type of this function is different from other functions. Instead of returning a Dataset[AQAnnotation], this function returns a Dataset[(AQAnnotation,Array[AQAnnotation])].

class RegexProperty extends AnyRef

Provide the ability to filter a property field using a regular expression in a Dataset of AQAnnotations.

class RegexTokensSpan extends AnyRef

Provides the ability to apply a regular expression to the concatenated string generated by TokensSpan.

For the strings matching the regex, a Dataset[AQAnnotation] will be returned. Each AQAnnotation corresponds to the offsets of a match within the concatenated string.

class Sequence extends AnyRef

Provide the ability to find annotations that are before another annotation.

The input is two Datasets of AQAnnotations. We will call them A and B. The purpose is to find those annotations in A that are before B: the end offset of an annotation from A must be before the start offset of an annotation in B, and the two annotations must have the same document id. Unlike the Before function, we adjust the returned annotation a bit. For example, we set the annotType to "seq" and we use the A startOffset and the B endOffset, so the result spans both annotations. A distance operator can also be optionally specified. This requires an A annotation (endOffset) to occur n characters or fewer before the B annotation (startOffset).
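The difference from Before can be sketched on plain Scala collections: instead of returning A unchanged, each matching (A, B) pair produces a new "seq" annotation. `Ann` is a hypothetical simplified stand-in for AQAnnotation. The parameter name `dist`, the strict comparison, and returning one result per matching pair are assumptions.

```scala
object SequenceSketch {
  // Hypothetical, simplified stand-in for AQAnnotation (not the library's class).
  final case class Ann(docId: String, annotType: String, startOffset: Long, endOffset: Long)

  // For every A that ends before a B starts (same document, gap at most dist),
  // emit an annotation of type "seq" running from A's start to B's end.
  def sequence(as: Seq[Ann], bs: Seq[Ann], dist: Long = Long.MaxValue): Seq[Ann] =
    for {
      a <- as
      b <- bs
      if a.docId == b.docId && a.endOffset < b.startOffset && b.startOffset - a.endOffset <= dist
    } yield Ann(a.docId, "seq", a.startOffset, b.endOffset)
}
```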

class TokensSpan extends AnyRef

Provides the ability to create a string from a list of tokens that are contained in a span.

The specified tokenProperty is used to extract the values from the tokens when creating the string. For SCNLP, this tokenProperty could be values like 'orig', 'lemma', or 'pos'. The spans would typically be a SCNLP 'sentence' or could even be things like an OM 'ce:para'. Returns a Dataset[AQAnnotation] of spans with three new properties, each named with the specified tokenProperty value as a prefix followed by ToksStr, ToksSpos, or ToksEpos. The ToksStr property is the concatenated string of token property values contained in the span. The ToksSpos and ToksEpos properties help determine the start/end offset of each individual token in the ToksStr. These helper properties are needed by the function RegexTokensSpan so that it can generate accurate start/end offsets from matches in the concatenated string.
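The role of the position helpers can be sketched on plain Scala collections. Everything here is an assumption made for illustration: `Token`, `SpanText`, the single-space separator, and the representation of the positions as maps from an index in the concatenated string to an original offset. The source does not describe the actual encoding of ToksSpos and ToksEpos.

```scala
import scala.util.matching.Regex

object TokensSpanSketch {
  // Hypothetical token and result types (not the library's classes).
  final case class Token(startOffset: Long, endOffset: Long, properties: Map[String, String])
  final case class SpanText(toksStr: String, toksSpos: Map[Int, Long], toksEpos: Map[Int, Long])

  // Concatenate the chosen property of each token, remembering where each token
  // begins and ends in the string and which original offsets those positions map to.
  def tokensSpan(tokens: Seq[Token], tokenProperty: String): SpanText = {
    val sb = new StringBuilder
    var spos = Map.empty[Int, Long]
    var epos = Map.empty[Int, Long]
    for (t <- tokens.sortBy(_.startOffset); value <- t.properties.get(tokenProperty)) {
      if (sb.nonEmpty) sb.append(' ')          // separator is an assumption
      spos += (sb.length -> t.startOffset)
      sb.append(value)
      epos += (sb.length -> t.endOffset)
    }
    SpanText(sb.toString, spos, epos)
  }

  // Map regex matches in the string back to original offsets. Matches that do
  // not begin and end on token boundaries are dropped in this sketch.
  def originalOffsets(text: SpanText, pattern: Regex): List[(Long, Long)] =
    pattern.findAllMatchIn(text.toksStr).toList.flatMap { m =>
      for (s <- text.toksSpos.get(m.start); e <- text.toksEpos.get(m.end)) yield (s, e)
    }
}
```

The original offsets cannot be read off the match positions directly, because the concatenated string does not preserve the original spacing between tokens. That is why the lookup goes through the two position maps.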
