Open Contracts Annotator Components

Key Questions

  1. How is the PDF loaded?
     • The PDF is loaded in the Annotator.tsx component.
     • The loading process starts inside the useEffect hook that runs when the openedDocument prop changes.
     • The pdfjsLib.getDocument function from the pdfjs-dist library loads the PDF file specified by openedDocument.pdfFile.
     • Loading progress is tracked through the loadingTask.onProgress callback, which updates the progress state.
     • When loading finishes, loadingTask.promise resolves to a PDFDocumentProxy object.
     • A PDFPageInfo object is created for each page of the PDF using doc.getPage(i), and the results are stored in the pages state.
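
The loading steps above can be sketched as follows. This is a minimal sketch, not the code in Annotator.tsx: LoadingTaskLike and DocLike are hypothetical structural types that mirror only the members named above (onProgress, promise, numPages, getPage), so the example runs without pdfjs-dist installed. The names progressPercent and loadPages are also illustrative.

```typescript
// Hypothetical structural types mirroring the pdfjs-dist members used above.
interface ProgressData { loaded: number; total: number; }
interface DocLike<P> {
  numPages: number;
  getPage(pageNumber: number): Promise<P>;
}
interface LoadingTaskLike<P> {
  onProgress: ((data: ProgressData) => void) | null;
  promise: Promise<DocLike<P>>;
}

// Convert a progress event into a whole-number percentage in 0..100.
function progressPercent(data: ProgressData): number {
  if (!(data.total > 0)) return 0; // guard against a missing or zero total
  return Math.min(100, Math.round((data.loaded / data.total) * 100));
}

// Wait for the document, then request every page.
// pdf.js page numbers start at 1, hence the 1..numPages loop.
async function loadPages<P>(
  task: LoadingTaskLike<P>,
  setProgress: (percent: number) => void
): Promise<P[]> {
  task.onProgress = (data) => setProgress(progressPercent(data));
  const doc = await task.promise;
  const pending: Promise<P>[] = [];
  for (let i = 1; i <= doc.numPages; i++) {
    pending.push(doc.getPage(i));
  }
  return Promise.all(pending);
}
```

With pdfjs-dist, the task argument would be the loading task returned by pdfjsLib.getDocument(openedDocument.pdfFile), and setProgress would be the setter for the progress state.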

  2. Where and how are annotations loaded?
     • Annotations are loaded using the REQUEST_ANNOTATOR_DATA_FOR_DOCUMENT GraphQL query in the Annotator.tsx component.
     • The useQuery hook from Apollo Client is used to fetch the annotator data based on the provided initial_query_vars.
     • The annotator_data received from the query contains information about existing text annotations, document label annotations, and relationships.
     • The annotations are transformed into ServerAnnotation, DocTypeAnnotation, and RelationGroup objects and stored in the pdfAnnotations state using setPdfAnnotations.
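
The transformation step can be illustrated with a small sketch. Everything here is an assumption made for illustration: the raw field names (rawText, labelType, sourceIds, targetIds), the label type strings, and the trimmed-down "Sketch" types are hypothetical stand-ins, not the actual query result or the real ServerAnnotation, DocTypeAnnotation, and RelationGroup classes.

```typescript
// Hypothetical, simplified raw shapes; the real query returns more fields.
interface RawLabel { id: string; text: string; labelType: "TOKEN_LABEL" | "DOC_TYPE_LABEL"; }
interface RawAnnotation { id: string; page: number; rawText: string; label: RawLabel; }
interface RawRelationship { id: string; label: RawLabel; sourceIds: string[]; targetIds: string[]; }

// Simplified stand-ins for the three object types named above.
interface ServerAnnotationSketch { id: string; page: number; text: string; label: string; }
interface DocTypeAnnotationSketch { id: string; label: string; }
interface RelationGroupSketch { id: string; label: string; sourceIds: string[]; targetIds: string[]; }

interface PdfAnnotationsSketch {
  textAnnotations: ServerAnnotationSketch[];
  docTypes: DocTypeAnnotationSketch[];
  relations: RelationGroupSketch[];
}

// Split raw annotations by label type and map relationships one-to-one.
function toPdfAnnotations(
  annotations: RawAnnotation[],
  relationships: RawRelationship[]
): PdfAnnotationsSketch {
  const textAnnotations: ServerAnnotationSketch[] = [];
  const docTypes: DocTypeAnnotationSketch[] = [];
  for (const a of annotations) {
    if (a.label.labelType === "DOC_TYPE_LABEL") {
      docTypes.push({ id: a.id, label: a.label.text });
    } else {
      textAnnotations.push({ id: a.id, page: a.page, text: a.rawText, label: a.label.text });
    }
  }
  const relations = relationships.map((r) => ({
    id: r.id,
    label: r.label.text,
    sourceIds: r.sourceIds,
    targetIds: r.targetIds,
  }));
  return { textAnnotations, docTypes, relations };
}
```

In the component, the returned object would be what is passed to setPdfAnnotations once the query data arrives.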

  3. Where is the PAWLs layer loaded?
     • The PAWLs layer is loaded in the Annotator.tsx component.
     • Inside the useEffect hook that runs when the openedDocument prop changes, the PAWLs layer is loaded using the getPawlsLayer function from api/rest.ts.
     • The getPawlsLayer function makes an HTTP GET request to fetch the PAWLs data file specified by openedDocument.pawlsParseFile.
     • The PAWLs data is expected to be an array of PageTokens objects, which contain token information for each page of the PDF.
     • The loaded PAWLs data is then used to create PDFPageInfo objects for each page, which include the page tokens.
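
To make the data flow concrete, here is a minimal sketch of pairing PAWLs tokens with pages. The PageTokens and Token shapes below are assumptions (one entry per page, each holding page metadata with a zero-based index and a list of token boxes), and PageInfoSketch is a hypothetical, trimmed-down stand-in for PDFPageInfo.

```typescript
// Assumed PAWLs shapes: one entry per page, with page metadata and token boxes.
interface Token { x: number; y: number; width: number; height: number; text: string; }
interface PageTokens {
  page: { index: number; width: number; height: number };
  tokens: Token[];
}

// Hypothetical stand-in for PDFPageInfo: a page handle plus its tokens.
interface PageInfoSketch<P> { pageIndex: number; page: P; tokens: Token[]; }

// Pair each loaded page with its tokens, matching on the zero-based page index.
// Pages without a PAWLs entry get an empty token list.
function attachTokens<P>(pages: P[], pawls: PageTokens[]): PageInfoSketch<P>[] {
  const byIndex = new Map<number, Token[]>();
  for (const entry of pawls) {
    byIndex.set(entry.page.index, entry.tokens);
  }
  return pages.map((page, i) => ({
    pageIndex: i,
    page,
    tokens: byIndex.get(i) ?? [],
  }));
}
```

Matching on page.index rather than array position is a design choice of this sketch; it keeps the pairing correct even if the PAWLs array is missing a page.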

High-level Components Overview

  • The Annotator component is the top-level component that manages the state and data loading for the annotator.
  • It renders the PDFView component, which is responsible for displaying the PDF and annotations.
  • The PDFView component renders various sub-components, such as LabelSelector, DocTypeLabelDisplay, AnnotatorSidebar, AnnotatorTopbar, and PDF.
  • The PDF component renders individual Page components for each page of the PDF.
  • Each Page component renders Selection and SearchResult components for annotations and search results, respectively.
  • The AnnotatorSidebar component displays the list of annotations, relations, and a search widget.
  • The PDFStore and AnnotationStore are context providers that hold the PDF and annotation data, respectively.

Specific Component Deep Dives

PDFView.tsx

The PDFView component, rendered by Annotator, displays the PDF document with annotations, relations, and text search. It manages the state and functionality related to annotations, relations, and user interactions. The component works as follows:

  1. The PDFView component receives several props, including permissions, callbacks for CRUD operations on annotations and relations, refs for container and selection elements, and various configuration options.

  2. It initializes several state variables using the useState hook, including:
     • selectionElementRefs and searchResultElementRefs: Refs for annotation selections and search results.
     • pageElementRefs: Refs for individual PDF pages.
     • scrollContainerRef: Ref for the scroll container.
     • textSearchMatches and searchText: State for text search matches and search text.
     • selectedAnnotations and selectedRelations: State for currently selected annotations and relations.
     • pageSelection and pageSelectionQueue: State for current page selection and queued selections.
     • pdfPageInfoObjs: State for PDF page information objects.
     • Various other state variables for active labels, relation modal visibility, and annotation options.

  3. The component defines several functions for updating state and handling user interactions, such as:
     • insertSelectionElementRef, insertSearchResultElementRefs, and insertPageRef: Functions to add refs for selections, search results, and pages.
     • onError: Error handling callback.
     • advanceTextSearchMatch and reverseTextSearchMatch: Functions to navigate through text search matches.
     • onRelationModalOk and onRelationModalCancel: Callbacks for relation modal actions.
     • createMultiPageAnnotation: Function to create a multi-page annotation from queued selections.

  4. The component uses the useEffect hook to handle side effects, such as:
     • Setting the scroll container ref on load.
     • Listening for changes in the shift key and triggering annotation creation.
     • Updating text search matches when the search text changes.

  5. The component renders the PDF document and its related components using the PDFStore and AnnotationStore contexts:
     • The PDFStore context provides the PDF document, pages, and error handling.
     • The AnnotationStore context provides annotation-related state and functions.

  6. The component renders the following main sections:
     • LabelSelector: Allows the user to select the active label for annotations.
     • DocTypeLabelDisplay: Displays the document type labels.
     • AnnotatorSidebar: Sidebar component for managing annotations and relations.
     • AnnotatorTopbar: Top bar component for additional controls and options.
     • PDF: The actual PDF component that renders the PDF pages and annotations.

  7. The PDF component, defined in PDF.tsx, is responsible for rendering the PDF pages and annotations. It receives props from the PDFView component, such as permissions, configuration options, and callbacks.

  8. The PDF component maps over each page of the PDF document and renders a Page component for each page, passing the necessary props.

  9. The Page component, also defined in PDF.tsx, is responsible for rendering a single page of the PDF document along with its annotations and search results. It handles mouse events for creating and modifying annotations.

  10. The PDFView component also renders the RelationModal component when the active relation label is set and the user has the necessary permissions. The modal allows the user to create or modify relations between annotations.
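
The text search behaviour described above (recomputing matches when the search text changes, then stepping forward and backward through them) can be sketched with three small pure functions. This is an illustration under stated assumptions, not the PDFView implementation: findMatches, advanceMatch, and reverseMatch are hypothetical names, matching is simplified to a case-insensitive substring test on single tokens, and wrap-around at the ends of the match list is an assumed policy.

```typescript
// A match is identified by the page it is on and the token it falls in.
interface Match { pageIndex: number; tokenIndex: number; }

// Recompute matches for the current search text.
// pages[i] holds the token texts of page i. Simplification: a match is any
// single token that contains the search text, ignoring case.
function findMatches(pages: string[][], searchText: string): Match[] {
  const needle = searchText.trim().toLowerCase();
  if (needle === "") return [];
  const matches: Match[] = [];
  pages.forEach((tokens, pageIndex) => {
    tokens.forEach((text, tokenIndex) => {
      if (text.toLowerCase().indexOf(needle) !== -1) {
        matches.push({ pageIndex, tokenIndex });
      }
    });
  });
  return matches;
}

// Step to the next match, wrapping to the first after the last (assumed policy).
function advanceMatch(current: number, matchCount: number): number {
  if (matchCount === 0) return 0;
  return (current + 1) % matchCount;
}

// Step to the previous match, wrapping to the last before the first.
function reverseMatch(current: number, matchCount: number): number {
  if (matchCount === 0) return 0;
  return (current - 1 + matchCount) % matchCount;
}
```

In a component, findMatches would typically run inside an effect keyed on the search text, and the two navigation helpers would update the index of the currently highlighted match.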

PDF.tsx

PDF renders the PDF document itself, along with annotations and text search results. Interaction with the backend / API is handled by PDFView (see above).

  1. The PDF component receives several props:
     • shiftDown: Indicates whether the Shift key is pressed (optional).
     • doc_permissions and corpus_permissions: Specify the permissions for the document and corpus, respectively.
     • read_only: Determines if the component is in read-only mode.
     • show_selected_annotation_only: Specifies whether to show only the selected annotation.
     • show_annotation_bounding_boxes: Specifies whether to show annotation bounding boxes.
     • show_annotation_labels: Specifies the behavior for displaying annotation labels.
     • setJumpedToAnnotationOnLoad: A callback function to set the jumped-to annotation on load.

  2. The PDF component retrieves the PDF document and pages from the PDFStore context.

  3. It maps over each page of the PDF document and renders a Page component for each page, passing the necessary props.

  4. The Page component is responsible for rendering a single page of the PDF document along with its annotations and search results.

  5. Inside the Page component:
     • It creates a canvas element using the useRef hook to render the PDF page.
     • It retrieves the annotations for the current page from the AnnotationStore context.
     • It defines a ConvertBoundsToSelections function that converts the selected bounds to annotations and tokens.
     • It uses the useEffect hook to set up the PDF page rendering and event listeners for resizing and scrolling.
     • It renders the PDF page canvas, annotations, search results, and queued selections.

  6. The Page component renders the following sub-components:
     • PageAnnotationsContainer: A styled container for the page annotations.
     • PageCanvas: A styled canvas element for rendering the PDF page.
     • Selection: Represents a single annotation selection on the page.
     • SearchResult: Represents a search result on the page.

  7. The Page component handles mouse events for creating and modifying annotations:
     • On mouseDown, it initializes the selection if the necessary permissions are granted and the component is not in read-only mode.
     • On mouseMove, it updates the selection bounds if a selection is active.
     • On mouseUp, it adds the completed selection to the pageSelectionQueue and triggers the creation of a multi-page annotation if the Shift key is not pressed.

  8. The Page component also handles fetching more annotations for previous and next pages using the FetchMoreOnVisible component.

  9. The SelectionBoundary and SelectionTokens components are used to render the annotation boundaries and tokens, respectively.

  10. The PDFPageRenderer class is responsible for rendering a single PDF page on the canvas. It manages the rendering tasks and provides methods for canceling and rescaling the rendering.

  11. The getPageBoundsFromCanvas function calculates the bounding box of the page based on the canvas dimensions and its parent container.
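
The mouse-driven selection flow above can be sketched without any DOM or React dependencies. The names below (boundsFromDrag, tokensInBounds, finishSelection, TokenBox) are hypothetical, and the sketch assumes that token boxes and selection bounds are already expressed in the same coordinate space; it is not the ConvertBoundsToSelections implementation.

```typescript
interface Point { x: number; y: number; }
interface Bounds { left: number; top: number; right: number; bottom: number; }
interface TokenBox { x: number; y: number; width: number; height: number; text: string; }

// Normalize a drag so the box is valid whichever direction the mouse moved
// between mouseDown (start) and the latest mouseMove or mouseUp (end).
function boundsFromDrag(start: Point, end: Point): Bounds {
  return {
    left: Math.min(start.x, end.x),
    top: Math.min(start.y, end.y),
    right: Math.max(start.x, end.x),
    bottom: Math.max(start.y, end.y),
  };
}

// Return the indices of tokens whose boxes overlap the selection bounds.
function tokensInBounds(tokens: TokenBox[], b: Bounds): number[] {
  const hits: number[] = [];
  tokens.forEach((t, i) => {
    const overlaps =
      t.x < b.right && t.x + t.width > b.left &&
      t.y < b.bottom && t.y + t.height > b.top;
    if (overlaps) hits.push(i);
  });
  return hits;
}

// On mouseUp: always queue the finished selection; create the annotation
// right away only when Shift is not held, so Shift lets selections accumulate.
function finishSelection(
  queue: Bounds[],
  selection: Bounds,
  shiftDown: boolean
): { queue: Bounds[]; createNow: boolean } {
  return { queue: queue.concat([selection]), createNow: !shiftDown };
}
```

Holding Shift therefore builds up a queue of selections, possibly across pages, which a function such as createMultiPageAnnotation can later turn into a single annotation.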