IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
UP-0771791
(2007-06-29)
|
Registration number |
US-7849399
(2011-01-31)
|
Inventor / Address |
|
Agent / Address |
|
Citation information |
Times cited: 6
Cited patents: 64 |
Abstract
A method and system for tracking authorship of content in data is described, wherein the method and system may be employed in collaborative text editing systems or in word processing applications to identify and track the contributions of individual authors. The method comprises aligning at least a portion of data from old or reference data with at least a portion of the data from new or target data, repeating the acts of aligning at least a portion of the data, storing any aligned data until no significant alignment of the data is obtained, and storing any unaligned data and authorship information.
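The loop described in the abstract (align, set the aligned data aside, repeat until no significant alignment remains, then attribute what is left to the new author) can be illustrated with a minimal Python sketch. It is not the patented implementation. As a stand-in for a local sequence alignment it uses the longest exactly matching run of tokens, found with the standard library's difflib.SequenceMatcher, and it treats "significant" as "at least min_len tokens long". The names track_authorship, free_runs, and min_len are hypothetical and chosen only for this illustration.

```python
from difflib import SequenceMatcher


def free_runs(used):
    """Return (start, end) index pairs for each maximal run of unused positions."""
    runs, start = [], None
    for i, flag in enumerate(used):
        if not flag and start is None:
            start = i
        elif flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(used)))
    return runs


def track_authorship(reference, target, new_author, min_len=2):
    """reference: list of (token, author) pairs; target: list of tokens.

    Returns a list of (token, author) pairs for the target.
    Assumption: an alignment is "significant" if it spans >= min_len tokens.
    """
    ref_tokens = [tok for tok, _ in reference]
    matcher = SequenceMatcher(None, ref_tokens, target, autojunk=False)
    ref_used = [False] * len(ref_tokens)
    tgt_used = [False] * len(target)
    # Every target token starts out attributed to the new author;
    # aligned tokens inherit the author recorded in the reference.
    authors = [new_author] * len(target)

    while True:
        best = None
        # Search only regions that earlier rounds have not already aligned.
        for r_lo, r_hi in free_runs(ref_used):
            for t_lo, t_hi in free_runs(tgt_used):
                m = matcher.find_longest_match(r_lo, r_hi, t_lo, t_hi)
                if best is None or m.size > best.size:
                    best = m
        if best is None or best.size < min_len:
            break  # no significant alignment left
        for k in range(best.size):
            ref_used[best.a + k] = True
            tgt_used[best.b + k] = True
            authors[best.b + k] = reference[best.a + k][1]

    return list(zip(target, authors))


reference = [(tok, "alice") for tok in "one two three four five six".split()]
target = "four five six new one two three".split()
for token, author in track_authorship(reference, target, "bob"):
    print(token, author)
```

In this example the two halves of the reference text have swapped places in the target. Because each round aligns one block and then excludes it, both moved blocks keep their original author, and only the inserted token is attributed to the new author.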
Representative claim
What is claimed is:
1. A computer implemented method to link authorship information to substrings of an electronic target string of symbols, wherein said electronic target string differs from an electronic reference string of symbols, the method comprising the steps of: splitting the electronic reference string of symbols into a reference sequence of substrings and splitting the electronic target string of symbols into a target sequence of substrings; arranging the substrings of the target sequence and the substrings of the reference sequence in a sequence alignment; determining aligned and unaligned substrings in the target sequence and the reference sequence; excluding at least one of the aligned substrings from subsequent steps of arranging the substrings in a sequence alignment; repeating the steps of arranging the substrings in a sequence alignment, determining aligned and unaligned substrings, and excluding the aligned substrings until no significant alignment can be obtained; and linking authorship information associated with the electronic target string of symbols to the unaligned substrings of the target sequence.
2. The method of claim 1, wherein the step of arranging the substrings in a sequence alignment comprises employing a local sequence alignment algorithm, wherein alignments capture the local compactness of natural language.
3. The method of claim 1, wherein the aligned substrings are replaced with control information, such that the control information can modulate subsequent steps of arranging the substrings in a sequence alignment.
4. The method of claim 1, wherein the step of linking authorship information to the unaligned substrings further comprises storing in an electronic archive at least the unaligned substrings of the target sequence and the authorship information linked to the unaligned substrings; and the step of splitting the electronic strings of symbols, further comprises: generating the electronic reference sequence of substrings from the electronic archive, wherein the authorship of unaligned substrings is tracked through a plurality of edit cycles.
5. The method of claim 2, wherein the local sequence alignment algorithm comprises an adaptation of the Smith-Waterman algorithm.
6. The method of claim 1, wherein excluded substrings influence a calculation of an alignment score of subsequent alignments.
7. The method of claim 1, wherein the step of arranging the substrings in a sequence alignment comprises the steps of: determining an alignment score for a possible alignment; and verifying the alignment score exceeds an expected alignment score.
8. The method of claim 1, wherein the step of splitting the electronic strings of symbols into sequences of substrings further comprises detecting substrings based on at least one of a punctuation mark, blank, symbol, markup language, formatting instructions and elements of a database record.
9. The method of claim 1, wherein the step of arranging the substrings in a sequence alignment further comprises the step of: determining the similarity of pairs of aligned substrings based on at least one of an edit distance measure, electronic thesaurus and electronic ontology.
10. The method of claim 1, wherein the step of arranging the substrings in a sequence alignment further comprises the step of: evaluating the integrity of sentences, paragraphs and formatting instructions when determining a significant sequence alignment.
11. The method of claim 2, wherein the local sequence alignment algorithm comprises a heuristic approximation of the Smith-Waterman algorithm.
12. The method of claim 1, wherein the method further comprises the step of: displaying the substrings with the linked authorship information on a display device to a user.
13. The method of claim 1, wherein the method further comprises the step of: displaying the unaligned substrings of the reference sequence on a display device to a user, such that the substrings are visually indicated as deleted.
14. The method of claim 1, wherein the step of arranging the substrings in a sequence alignment further comprises the steps of: assigning a weight to at least one of the substrings; and determining a significant alignment based on the weight of at least one of the substrings in the alignment.
15. The method of claim 1, wherein the step of linking authorship information to the unaligned substrings further comprises storing in an electronic document containing markup language at least the unaligned substrings of the target sequence and the authorship information linked to the unaligned substrings; and the step of splitting the electronic strings of symbols, further comprises generating the electronic reference sequence of substrings from the electronic document containing markup language.
16. A computer-readable medium comprising a computer program that comprises program code means to carry out the method to link authorship information to substrings of an electronic target string of symbols, wherein said electronic target string differs from an electronic reference string of symbols, according to claim 1, wherein said program runs on a computer.
17. A computer implemented system to link authorship information to substrings of an electronic target string of symbols, wherein said electronic target string differs from an electronic reference string of symbols, the system comprising: splitting means adapted to split the electronic reference string of symbols into a reference sequence of substrings and to split the electronic target string of symbols into a target sequence of substrings; sequence alignment means adapted to arrange the substrings of the target sequence and the substrings of the reference sequence in a sequence alignment; determining means adapted to determine aligned and unaligned substrings in the target sequence and the reference sequence; means adapted to exclude at least one of the aligned substrings during subsequent steps of arranging the substrings in a sequence alignment; and storing means adapted to link authorship information associated with the electronic target string of symbols to the unaligned substrings of the target sequence; wherein the sequence aligning means are adapted to repeat the steps of arranging the substrings in a sequence alignment, determining aligned and unaligned substrings, and excluding the aligned substrings until no significant alignment can be obtained.
18. The system of claim 17, wherein the sequence alignment means are further adapted to: employ a local sequence alignment algorithm, wherein alignments capture the local compactness of natural language.
19. The system of claim 17, wherein the aligned substrings are replaced with control information, such that the control information can modulate subsequent steps of arranging the substrings in a sequence alignment.
20. The system of claim 17, wherein the storing means are further adapted to: store, in an electronic archive, at least one of the unaligned substrings of the target sequence and the authorship information linked to the unaligned substrings; and the splitting means are further adapted to generate the electronic reference sequence of substrings from the electronic archive, wherein the authorship of unaligned substrings is tracked through a plurality of edit cycles.
21. The system of claim 18, wherein the local sequence alignment algorithm comprises an adaptation of the Smith-Waterman algorithm.
22. The system of claim 17, wherein excluded substrings influence a calculation of an alignment score of subsequent alignments.
23. The system of claim 17, wherein the sequence alignment means are further adapted to: determine an alignment score for a possible alignment; and verify the alignment score exceeds an expected alignment score.
24. The system of claim 17, wherein the splitting means are further adapted to detect substrings based on at least one of a punctuation mark, blank, symbol, markup language, formatting instructions and elements of a database record.
25. The system of claim 17, wherein the sequence alignment means are further adapted to determine the similarity of pairs of aligned substrings based on at least one of an edit distance measure, electronic thesaurus and electronic ontology.
26. The system of claim 17, wherein the sequence alignment means are further adapted to evaluate the integrity of sentences, paragraphs and formatting instructions when determining a significant sequence alignment.
27. The system of claim 18, wherein the local sequence alignment algorithm comprises a heuristic approximation of the Smith-Waterman algorithm.
28. The system of claim 17, further comprising a display means that in combination with the storage means are adapted to display the substrings with the stored authorship information to a user.
29. The system of claim 17, further comprising a display means adapted to: display the unaligned substrings of the reference sequence to a user, such that the substrings are visually indicated as deleted.
30. The system of claim 17, wherein the sequence alignment means are further adapted to: assign a weight to at least one of the substrings; and determine a significant alignment based on the weight of at least one of the substrings in the alignment.
31. The system of claim 17, wherein the storing means are further adapted to: store, in an electronic document containing markup language, at least the unaligned substrings of the target sequence and the authorship information linked to the unaligned substrings; and the splitting means are further adapted to generate the electronic reference sequence of substrings from the electronic document containing markup language.
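The claims name two building blocks that a short example may make more concrete: splitting a string into substrings at blanks and punctuation marks, and a local sequence alignment in the style of the Smith-Waterman algorithm. The Python sketch below shows the textbook Smith-Waterman recurrence applied to word tokens rather than characters. It is not the adaptation or heuristic approximation that the claims refer to, whose details are not given in this record. The function names split_tokens and smith_waterman, and the scoring values (match=2, mismatch=-1, gap=-1), are assumptions made for illustration.

```python
import re


def split_tokens(text):
    """Split text into words and single punctuation marks; blanks separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def smith_waterman(ref, tgt, match=2, mismatch=-1, gap=-1):
    """Textbook local alignment of two token lists with a linear gap penalty.

    Returns (best_score, pairs), where pairs lists (ref_index, tgt_index)
    for every aligned column, including substitutions.
    Scoring values are illustrative assumptions, not taken from the patent.
    """
    rows, cols = len(ref) + 1, len(tgt) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        return match if ref[i - 1] == tgt[j - 1] else mismatch

    # Fill the matrix; a cell never drops below zero, which is what
    # makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # reference token with no counterpart in the target
        else:
            j -= 1  # target token with no counterpart in the reference
    pairs.reverse()
    return best, pairs


ref = split_tokens("the cat sat on the mat")
tgt = split_tokens("the cat sat quietly on the mat")
best, pairs = smith_waterman(ref, tgt)
aligned_tgt = {j for _, j in pairs}
unaligned = [tok for j, tok in enumerate(tgt) if j not in aligned_tgt]
print(best, pairs, unaligned)
```

Target tokens that fall outside the aligned pairs are the candidates for new authorship information. A fuller treatment along the lines of the claims would repeat the alignment on the remaining tokens and apply a significance test to each score.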