Determining timestamps to be associated with events in machine data
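IPC classification information
Country / type: United States (US) Patent, granted
International Patent Classification (IPC, 7th edition): G06F-007/00; G06F-017/30
Application number: US-0929248 (2015-10-30)
Registration number: US-9922065 (2018-03-20)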
Inventors / Address
Swan, Erik M.
Carasso, R. David
Das, Robin Kumar
Greene, Rory
Hall, Bradley
Mealy, Nicholas Christian
Murphy, Brian Philip
Sorkin, Stephen Phillip
Stechert, Andre David
Baum, Michael Joseph
Applicant / Address
Splunk Inc.
Agent / Address
Wong & Rees LLP
Citation information
Times cited: 4
Cited patents: 39
Abstract
Methods and apparatus consistent with the invention provide the ability to organize, index, search, and present time series data based on searches. Time series data are sequences of time stamped records occurring in one or more usually continuous streams, representing some type of activity. In one embodiment, time series data is organized into discrete events with normalized time stamps and the events are indexed by time and keyword. A search is received and relevant event information is retrieved based in whole or in part on the time indexing mechanism, keyword indexing mechanism, or statistical indices calculated at the time of the search.
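As a rough illustration of indexing events "by time and keyword" and answering a search from those indices, the following minimal Python sketch may help. It is not taken from the patent: the function names `build_index` and `search`, the whitespace tokenizer, and the use of epoch seconds are all assumptions made for the example.

```python
import bisect
from collections import defaultdict


def build_index(events):
    """Build a toy index from (epoch_seconds, raw_text) pairs.

    Returns (times, texts, postings): two parallel lists sorted by time,
    plus a keyword -> set-of-positions map. Hypothetical layout, chosen
    only to illustrate time indexing combined with keyword indexing.
    """
    ordered = sorted(events)
    times = [t for t, _ in ordered]
    texts = [x for _, x in ordered]
    postings = defaultdict(set)
    for pos, text in enumerate(texts):
        # Assumption: keywords are lowercase whitespace-separated tokens.
        for word in text.lower().split():
            postings[word].add(pos)
    return times, texts, postings


def search(index, start, end, keyword=None):
    """Return events with start <= time <= end, optionally filtered by keyword."""
    times, texts, postings = index
    # The time index narrows the candidate range with two binary searches.
    lo = bisect.bisect_left(times, start)
    hi = bisect.bisect_right(times, end)
    hits = range(lo, hi)
    if keyword is not None:
        # The keyword index filters the time-selected candidates.
        matches = postings.get(keyword.lower(), set())
        hits = [p for p in hits if p in matches]
    return [(times[p], texts[p]) for p in hits]


index = build_index([
    (30, "disk full on host-a"),
    (10, "login ok user=bob"),
    (20, "login failed user=eve"),
])
print(search(index, 10, 20, "login"))
```

In this sketch the time range is resolved first and the keyword filter second; a real system could equally intersect the two indices in the other order.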
Representative claim
1. A method, comprising:
    segmenting machine data stored on at least one storage device into a set of events that are searchable, each event in the set of events includes a portion of the machine data, wherein the portions of the machine data associated with at least a subset of events in the set of events includes time information;
    creating a timestamp for each event in the subset of events that includes time information by:
        iterating over known time stamp format patterns from a list of known time stamp format patterns to find a matching pattern in the time information, wherein each time stamp format pattern in the list represents a pattern that may occur in the time information which indicates where a time stamp may be extracted from, wherein the list is dynamically ordered and the matching pattern is moved to the front of the list;
        extracting a time value from the time information using the matching pattern; and
        associating the timestamp with that event using the time value;
    for each event that does not contain time information in the included portion of machine data:
        determining a time stamp corresponding to that event from at least one other event in the set of events; and
        associating the determined time stamp with the corresponding event;
    servicing time-based search queries across the set of events;
    wherein the method is performed by one or more computing devices.
2. The method of claim 1, wherein the set of events are in chronological order.
3. The method of claim 1, wherein determining a time stamp further comprises interpolating time stamps of one or more events preceding and following the event.
4. The method of claim 1, wherein the time stamp for each event that does not contain time information is determined using linear interpolation.
5. The method of claim 1, wherein when an event does not contain time information and the event preceding or following the event also does not contain time information, the time stamp is determined by using the next respective preceding or following event.
6. The method of claim 1, wherein the known time stamp format patterns are store in a list, and wherein the list is ordered by most frequently occurring to least frequently occurring.
7. The method of claim 1, wherein aggregating the machine data into a set of events includes identifying a domain for the machine data.
8. The method of claim 1, wherein aggregating the machine data into a set of events includes identifying boundaries between events using machine learning.
9. The method of claim 1, wherein aggregating the machine data into a set of events includes analyzing a portion of the machine data to detect the beginning and ending of events within the machine data.
10. The method of claim 1, wherein each of the events in the set of events includes an unaltered portion of the machine data.
11. The method of claim 1, further comprising determining the boundaries between events by detecting the beginning of each subsequent event.
12. The method of claim 1, further comprising indexing the events based on the time stamp.
13. The method of claim 1, wherein the collection of machine data is in a stream, and aggregating into events includes determining a rule for separating the stream based on one or more of: a source of the stream of data, a domain of the stream of data, a type of the stream of data, a signature of the stream of data, a punctuation analysis of the stream of data, or an analysis of repeating patterns within the stream of data.
14. The method of claim 1, wherein the collection of machine data is part of a server log, a portion of a log file, a transaction record, a recorded measurement, message bus traffic, network data, network management data, or an output of an application.
15. The method of claim 1, further comprising assigning each event in the set of events to a bucket having an associated time range that includes the time represented by the time stamp for the event.
16. The method of claim 1, wherein the collection of machine data is from logs from a plurality of different sources and the format associated with each source is determined so as to automatically extract the timestamp for each event in the subset of events that includes time information.
17. The method of claim 1, further comprising dividing event data in each event into segments.
18. The method of claim 1, wherein the events are time stamped events indexed according to time to facilitate search queries on the events using time-based operators.
19. The method of claim 1, further comprising performing entity extraction to identify semantic entities within the machine data of the time stamped events.
20. A non-transitory, computer-readable storage medium storing instructions, an execution of which in a computer system causes the computer system to perform operations comprising:
    segmenting machine data stored on at least one storage device into a set of events that are searchable, each event in the set of events includes a portion of the machine data, wherein the portions of the machine data associated with at least a subset of events in the set of events includes time information;
    creating a timestamp for each event in the subset of events that includes time information by:
        iterating over known time stamp format patterns from a list of known time stamp format patterns to find a matching pattern in the time information, wherein each time stamp format pattern in the list represents a pattern that may occur in the time information which indicates where a time stamp may be extracted from, wherein the list is dynamically ordered and the matching pattern is moved to the front of the list;
        extracting a time value from the time information using the matching pattern; and
        associating the timestamp with that event using the time value;
    for each event that does not contain time information in the included portion of machine data:
        determining a time stamp corresponding to that event from at least one other event in the set of events; and
        associating the determined time stamp with the corresponding event;
    servicing time-based search queries across the set of events.
21. The computer-readable storage medium of claim 20, wherein determining a time stamp further comprises interpolating time stamps of one or more events preceding and following the event.
22. The computer-readable storage medium of claim 20, wherein the time stamp for each event that does not contain time information is determined using linear interpolation.
23. The computer-readable storage medium of claim 20, wherein when an event does not contain time information and the event preceding or following the event also does not contain time information, the time stamp is determined by using the next respective preceding or following event.
24. The computer-readable storage medium of claim 20, wherein aggregating the machine data into a set of events further comprises using extraction to detect the beginning and ending of events within the machine data.
25. A computer system comprising:
    computer memory for storing machine data; and
    a processor for:
        segmenting machine data stored on at least one storage device into a set of events that are searchable, each event in the set of events includes a portion of the machine data, wherein the portions of the machine data associated with at least a subset of events in the set of events includes time information;
        creating a timestamp for each event in the subset of events that includes time information by:
            iterating over known time stamp format patterns from a list of known time stamp format patterns to find a matching pattern in the time information, wherein each time stamp format pattern in the list represents a pattern that may occur in the time information which indicates where a time stamp may be extracted from, wherein the list is dynamically ordered and the matching pattern is moved to the front of the list;
            extracting a time value from the time information using the matching pattern; and
            associating the timestamp with that event using the time value;
        for each event that does not contain time information in the included portion of machine data:
            determining a time stamp corresponding to that event from at least one other event in the set of events; and
            associating the determined time stamp with the corresponding event;
        servicing time-based search queries across the set of events.
26. The computer system of claim 25, wherein determining a time stamp further comprises interpolating time stamps of one or more events preceding and following the event.
27. The computer system of claim 25, wherein the time stamp for each event that does not contain time information is determined using linear interpolation.
28. The computer system of claim 25, wherein when an event does not contain time information and the event preceding or following the event also does not contain time information, the time stamp is determined by using the next respective preceding or following event.
29. The computer system of claim 25, wherein aggregating the machine data into a set of events includes using extraction to detect the beginning and ending of events within the machine data.
30. The computer system of claim 25, wherein aggregating the machine data into a set of events includes identifying boundaries between events using machine learning.
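The two timestamping steps recited in the independent claims (matching against a dynamically ordered pattern list with move-to-front, and deriving a time stamp for events without time information from neighbouring events) can be sketched roughly as follows. This is only an illustrative Python sketch under stated assumptions: the two regex/format pairs in `PATTERNS`, the helper names `extract_time` and `fill_missing`, the UTC assumption, and interpolation by event position are all hypothetical choices, not details confirmed by the claims.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern list: (regex locating the time text, strptime format).
PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "%Y-%m-%d %H:%M:%S"),
    (re.compile(r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}"), "%m/%d/%Y %H:%M:%S"),
]


def extract_time(text, patterns):
    """Try patterns in list order; on a match, move that pattern to the front.

    Returns epoch seconds (assuming UTC), or None if no pattern matches.
    """
    for i, (regex, fmt) in enumerate(patterns):
        match = regex.search(text)
        if match:
            # Move-to-front: the next event tries this pattern first.
            patterns.insert(0, patterns.pop(i))
            parsed = datetime.strptime(match.group(0), fmt)
            return parsed.replace(tzinfo=timezone.utc).timestamp()
    return None


def fill_missing(times):
    """Fill None entries by linear interpolation between the nearest known
    neighbours, weighting by event position (an assumption of this sketch).
    Leading or trailing gaps copy the nearest known value."""
    known = [i for i, t in enumerate(times) if t is not None]
    filled = list(times)
    if not known:
        return filled
    for i, t in enumerate(times):
        if t is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled[i] = times[nxt]
        elif nxt is None:
            filled[i] = times[prev]
        else:
            span = times[nxt] - times[prev]
            filled[i] = times[prev] + span * (i - prev) / (nxt - prev)
    return filled


patterns = list(PATTERNS)
lines = [
    "03/20/2018 12:00:00 GET /index",
    "  continuation line without a time",
    "03/20/2018 12:00:10 GET /about",
]
print(fill_missing([extract_time(line, patterns) for line in lines]))
```

Skipping over neighbours that also lack a time stamp until a known one is found is one possible reading of the "next respective preceding or following event" language; other interpolation schemes would fit the claims equally well.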
Patents cited by this patent (39)
Owen, James G.; Singh, Rajiv; Chen, Rong; Gahinet, Pascal, Analysis of a sequence of data in object-oriented environments.
Reed Drummond Shattuck; Heymann Peter Earnshaw; Mushero Steven Mark; Jones Kevin Benard; Oberlander Jeffrey Todd, Computer-based communication system and method using metadata defining a control-structure.
Ransil, Patrick W.; Martynov, Aleksey V.; Larson, James S.; Collette, James R.; Chu, Robert Wai-Chi; Saha, Partha, Method and apparatus for data partitioning and replication in a searchable data service.
Swan, Erik M.; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Baum, Michael Joseph, Normalization of time stamps for event data.
Gerald D. Baulier; Stephen M. Blott; Benson L. Branch; Thomas M. Cliff, Jr.; Henry F. Korth; Jonathan E. Polito; Abraham Silberschatz; Scott L. Speicher, Real-time event processing system for telecommunications and other applications.
Baum, Michael Joseph; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Swan, Erik M., Time series search engine.
Baum, Michael Joseph; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Swan, Erik M., Time series search in primary and secondary memory.
Baum, Michael J.; Carasso, David; Das, Robin K.; Greene, Rory; Hall, Brad; Mealy, Nick; Murphy, Brian; Sorkin, Stephen; Stechert, Andre; Swan, Erik M., Time series search with interpolated time stamp.
Swan, Erik M.; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Baum, Michael Joseph, Application of search policies to searches on event data stored in persistent data structures.
Swan, Erik M.; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Baum, Michael Joseph, Expiration of persistent data structures that satisfy search queries.
Baum, Michael Joseph; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Swan, Erik M., Source differentiation of machine data.
Swan, Erik M.; Carasso, R. David; Das, Robin Kumar; Greene, Rory; Hall, Bradley; Mealy, Nicholas Christian; Murphy, Brian Philip; Sorkin, Stephen Phillip; Stechert, Andre David; Baum, Michael Joseph, Time stamp creation for event data.