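IPC classification information
Country / Type | United States (US) Patent, granted
International Patent Classification (IPC 7th edition) |
Application number | US-0094434 (2002-03-08)
Priority information | GB-0107372.5 (2001-03-23)
Inventors / Address |
- Durrant, Paul
- Hanson, Stephen R
- Gordon, David S
- Moiin, Hossein
Applicant / Address |
Agent / Address | Meyertons Hood Kivlin Kowert &
Citation information | Times cited: 35; Cited patents: 26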
Abstract
A computer system comprises a processor (2), memory (4) and a plurality of devices (6, 8, 12), the processor (2) and the memory (4) being operable to effect the operation of a fault response processor (AFR), and a device driver (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL) for each of the devices. The fault response processor (AFR) is operable to generate a model which represents the processor (2), the memory (4) and the devices (6, 8, 12) of the computer system and the inter-connection of the processor (2), memory (4) and the devices (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL). The device driver (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL) for each of the devices (6, 8, 12) is arranged, consequent upon a change of operational status of the device, to generate fault report data indicating whether the change of status was caused internally within the device or externally by another connected device. The devices of the computer system may be formed as a plurality of Field Replaceable Units (FRU). The fault response processor (AFR) is operable, consequent upon receipt of the fault reports from the device drivers (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL) to estimate the location of a FRU containing a faulty device by applying the fault indication to the model. In other embodiments the fault report data includes direction information indicating a connection between the device and the other connected device which caused the external fault. Having identified the faulty device the FRU may be replaced, thereby minimizing down time of the computer system.
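The localization idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the tree topology, the FRU labels (`FRU-A`, `FRU-B`, `FRU-C`) and every class and function name below are assumptions made for illustration; only the driver names and the up/degraded/down and internal/external distinctions come from the record. Each report that blames an external cause is followed up the data path until a device that reports an internal fault (or has not reported a fault) is reached, and that device's FRU becomes the suspect.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


@dataclass
class FaultReport:
    device: str
    status: Status
    internal: bool  # True: fault arose inside the device; False: caused by a connected device


@dataclass
class Device:
    name: str
    fru: str                      # hypothetical Field Replaceable Unit label
    parent: Optional[str] = None  # preceding device on the data path (None for the root)


def locate_faulty_frus(model: Dict[str, Device], reports: List[FaultReport]) -> List[str]:
    """Return the FRUs suspected of containing a faulty device."""
    by_device = {r.device: r for r in reports}
    suspects: List[str] = []
    for report in reports:
        if report.status is Status.UP:
            continue
        current = report.device
        while True:
            current_report = by_device.get(current)
            # Stop at a device that reports an internal fault, or that has
            # reported no fault at all: it is the best available suspect.
            stops_here = (
                current_report is None
                or current_report.status is Status.UP
                or current_report.internal
            )
            parent = model[current].parent
            if stops_here or parent is None:
                break
            current = parent  # externally caused: look at the preceding device
        fru = model[current].fru
        if fru not in suspects:
            suspects.append(fru)
    return suspects


# Assumed topology: H2IO is the root, with GRAPHICS and IO2L beneath it,
# and NETWORK and SERIAL beneath IO2L.
model = {
    "H2IO": Device("H2IO", "FRU-A"),
    "GRAPHICS": Device("GRAPHICS", "FRU-A", "H2IO"),
    "IO2L": Device("IO2L", "FRU-B", "H2IO"),
    "NETWORK": Device("NETWORK", "FRU-C", "IO2L"),
    "SERIAL": Device("SERIAL", "FRU-C", "IO2L"),
}
reports = [
    FaultReport("NETWORK", Status.DOWN, internal=False),
    FaultReport("SERIAL", Status.DEGRADED, internal=False),
    FaultReport("IO2L", Status.DOWN, internal=True),
]
print(locate_faulty_frus(model, reports))
```

In this example the NETWORK and SERIAL reports are traced back to IO2L, so only the FRU assumed to hold IO2L is flagged and the FRU holding the two downstream devices is left alone.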
Representative claims
What is claimed is: 1. A computer system comprising: a plurality of devices, a plurality of device drivers, each device driver operable to monitor an operational status of one of said plurality of devices, and a fault response processor operable to generate a model which represents the monitored devices of the computer system and an inter-connection of said monitored devices, wherein said device driver for each of said monitored devices further being operable, consequent upon a change of operational status of said monitored device, to generate fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status of the monitored device was caused internally within the monitored device or externally by another connected device, wherein said fault response processor is operable, consequent upon receipt of said fault report data from said device drivers, to estimate a location of a faulty device by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to said model, wherein said fault response processor is operable to pre-process said model by comparing the operational status information from fault report data associated with successively connected devices in a data path. 2. A computer system as claimed in claim 1, wherein said operational status of each of said monitored devices is at least one of up, indicating no fault, degraded, indicating that the monitored device is still operational but with impaired performance, or down, indicating that the monitored device is not operational. 3. 
A computer system as claimed in claim 1, wherein each of said device drivers are operable, if said fault report data indicates that said change of operational status was caused externally, to generate fault direction information indicative of a connection from which an external fault is perceived, wherein said fault response processor being operable to estimate the location of said faulty device by applying said fault direction information to said model. 4. A computer system as claimed in claim 3, wherein said fault response processor is operable to generate fault probability measures for one or more monitored devices in the model, wherein each fault probability measure is representative of a perceived likelihood that the monitored device is faulty, wherein the fault probability measures being generated by applying the fault direction information and the operational status information to the model, wherein said fault response processor is operable to compare the fault probability measures for the monitored devices in the model with a predetermined threshold, and consequent upon the comparison, to estimate the location of the faulty device from a result of the comparison. 5. A computer system as claimed in claim 4, wherein said fault response processor is operable, for each monitored device represented in said model having a plurality of fault probability measures associated with the monitored device, to combine the fault probability measures for the monitored device, wherein a combined fault probability measure being compared with said predetermined threshold to provide an estimated location of said faulty device. 6. A computer system as claimed in claim 1, wherein said monitored devices are arranged as a plurality of groups, each group containing one or more monitored devices of said computer system, and wherein an estimated location produced by said fault response processor is an estimate of one or more of said groups having a faulty device. 7. 
A computer system as claimed in claim 6, wherein environment data representative of a parameter value of at least one environment sensor is generated in association with one of said groups, wherein said fault response processor being operable to analyze said environment data in association with said estimate of one or more of said groups having a faulty device to generate an improved estimate of a location of a faulty group from said model. 8. A computer system as claimed in claim 7, wherein said groups comprising one or more of said monitored devices arranged to form Field Replaceable Units (FRUs). 9. A computer system as claimed in claim 8, wherein said at least one environment sensor is associated with at least one of said Field Replaceable Units (FRUs). 10. A computer system as claimed in claim 7, wherein said fault response processor is operable to generate diagnostic report data representative of said estimate of one or more of said groups having a faulty device and of said improved estimate of the location of the faulty group. 11. A computer system as claimed in claim 10, comprising a graphical user interface, wherein said fault response processor is operable to produce said diagnostic report data on said graphical user interface. 12. A computer system as claimed in claim 1, wherein said model is a device tree having at least two hierarchical levels into which said monitored devices are divided, wherein the monitored devices in each level being connected with at least one monitored device in a subsequent level, wherein each connection representing a data path. 13. A computer system as claimed in claim 1, wherein said fault response processor is operable to generate said model of said computer system from the fault report data, wherein said model representing potentially faulty devices of said computer system. 14. 
A fault response processor for use in estimating a location of at least one of a plurality of devices of a system which is faulty, said fault response processor being operable to: generate a data model having a structure which represents said plurality of devices and the inter-connection of said devices, receive fault report data generated by device drivers following a change in the operational status of one or more of the devices, wherein said fault report data including the operational status of the device and a fault indication of whether the change in the operational status was caused internally within the device or externally by another connected device, pre-process said model by comparing the operational status information from fault report data associated with successively connected devices in a data path, and estimate a location of a faulty device, within said model, by applying the operational status of one or more of the devices and the fault indication corresponding to one or more of the devices to the model. 15. A fault response processor as claimed in claim 14, wherein said operational status of said device is one of up, indicating no fault, degraded, indicating that the device is still operational but with impaired performance, or down, indicating that the device is not operational. 16. A fault response processor as claimed in claim 14, wherein if said fault report data indicates that said change of operational status was caused externally, the device drivers are operable to generate fault direction information indicative of a relative direction on a connection from which an external fault is perceived, wherein said fault response processor being operable to estimate the location of said faulty device by applying said fault direction information to said model. 17. 
A fault response processor as claimed in claim 16, wherein said fault response processor is operable to generate fault probability measures associated with one or more devices in the model, wherein each fault probability measure is representative of a perceived likelihood that the device is faulty, wherein the fault probability measures being generated by applying the fault direction information and the operational status information to the model, wherein said fault response processor is operable to compare the fault probability measures for the devices in the model with a predetermined threshold, and consequent upon the comparison, to estimate the location of the faulty device from a result of the comparison. 18. A fault response processor as claimed in claim 17, wherein said fault response processor is operable, for each device represented in said model having a plurality of fault probability measures associated with said device generated from said fault report data, to combine the fault probability measures for the device, wherein a combined fault probability measure being compared with said predetermined threshold to provide an estimated location of said faulty device. 19. A fault response processor as claimed in claim 17, wherein said devices are arranged as a plurality of groups, each group containing one or more devices of said computer system, and wherein said estimated location produced by said fault response processor is an estimate of one or more of said groups having a faulty device. 20. A fault response processor as claimed in claim 19, wherein environment data representative of a parameter value of at least one environment sensor is generated in association with a performance of one of said groups, wherein said fault response processor being operable to analyze said environment data in association with said estimate of one or more of said groups having a faulty device to generate an improved estimate of a location of a faulty group from said model. 
21. A fault response processor as claimed in claim 20, wherein said groups comprising one or more of said devices are arranged to form Field Replaceable Units (FRUs). 22. A fault response processor as claimed in claim 20, wherein said fault response processor is operable to generate diagnostic report data representative of said estimate of one or more of said groups having a faulty device and of said improved estimate of the location of the faulty group. 23. A fault response processor as claimed in claim 22, wherein said fault response processor is operable to produce said diagnostic report data on a graphical user interface. 24. A fault response processor as claimed in claim 14, wherein said model is a device tree having at least two hierarchical levels into which said devices are divided, wherein the devices in each level being connected with at least one device in a subsequent level, wherein each connection representing a data path. 25. A fault response processor as claimed in claim 14, wherein said fault response processor is operable to generate said model of said system from the fault report data, wherein said model representing potentially faulty devices of said system. 26. 
A method of locating faulty devices in a system including a plurality of devices, said method comprising: monitoring an operational status of one or more of the plurality of devices; generating a model of said system, wherein the model includes a structure representing the plurality of monitored devices and the inter-connection of the monitored devices via at least one data path; generating fault report data consequent upon a change of operational status of at least one of said monitored devices, wherein said fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status of the monitored device was caused internally within the monitored device or externally by another connected device; pre-processing said model by comparing the operational status information from fault report data associated with successively connected devices in a data path; estimating a location of a faulty device, within said model, by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to the model. 27. A method of locating faulty devices as claimed in claim 26, wherein said operational status of each of said monitored devices is one of up, indicating no fault, degraded, indicating that the monitored device is still operational but with impaired performance, or down, indicating that the monitored device is not operational. 28. A method of locating faulty devices as claimed in claim 26, further comprising generating fault direction information indicative of a relative direction on a connection from which an external fault is perceived if said fault report data indicates that said change of operational status was caused externally, wherein said estimating the location of the faulty device comprising applying said fault direction information to said model. 29. 
A method as claimed in claim 26, further comprising comparing the operational status information from fault report data associated with successively connected devices in a data path, wherein if the operational status indicates that a preceding device on the data path is degraded or down, fault direction information is generated for the preceding device indicating that a fault is internal, and wherein if the operational status indicates that a succeeding device on the data path is down or degraded, fault direction information is generated for the succeeding device indicating that a fault is external, wherein said estimating the location of said faulty device comprises disregarding fault report data associated with said succeeding device and estimating the location of said faulty device from remaining fault report data. 30. A method as claimed in claim 26, further comprising: generating fault probability measures for one or more monitored devices in said model, wherein each fault probability measure is representative of a perceived likelihood that said monitored device is faulty, wherein said fault probability measures being generated by applying the fault direction information and the operational status information to the model, comparing said fault probability measures for the monitored devices in said model with a predetermined threshold, and consequent upon the comparison, estimating said location of said faulty device from a result of the comparison. 31. A method as claimed in claim 30, further comprising, for each monitored device represented in said model having a plurality of fault probability measures associated with said monitored device from said fault report data, combining the fault probability measures for the monitored device, and then comparing a combined fault probability measure with said predetermined threshold to provide an estimated location of said faulty device. 32. 
A method as claimed in claim 30, further comprising arranging said monitored devices as a plurality of groups, each group containing one or more monitored devices of said system, wherein said estimating the location of said faulty device provides an estimate of one or more of said groups having a faulty device. 33. A method as claimed in claim 32, further comprising: generating environment data representative of a parameter value of at least one sensor associated with a performance of at least one group of monitored devices, analyzing said environment data in association with said estimate of one or more of said groups having a faulty device to generate an improved estimate of a location of a faulty group from said model. 34. A method as claimed in claim 32, wherein said groups comprising one or more monitored devices arranged to form Field Replaceable Units (FRUs). 35. A method as claimed in claim 33, further comprising generating diagnostic report data representative of said estimate of one or more of said groups having a faulty device and of said improved estimate of the location of the faulty group. 36. A method as claimed in claim 35, wherein said generating said diagnostic report data includes producing said diagnostic report data on a graphical user interface. 37. A method as claimed in claim 26, wherein said model is a device tree having at least two hierarchical levels into which said monitored devices are divided, wherein the monitored devices in each level being connected with at least one monitored device in a subsequent level, wherein each connection representing a data path. 38. A method as claimed in claim 28, wherein said generating said model of said system, comprises: identifying the fault report data generated within a time epoch, and generating said model using said fault indication, said operational status information and said fault direction information, wherein said model representing potentially faulty devices of said system. 39. 
A computer readable storage medium comprising program instructions, wherein the program instructions are executable by a processor to: monitor an operational status of a plurality of devices; generate a model of a system, wherein the model includes a structure representing the plurality of monitored devices included in the system and the inter-connection of the monitored devices via at least one data path; generate fault report data consequent upon a change of operational status of at least one of said monitored devices, wherein said fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status of the monitored device was caused internally within the monitored device or externally by another connected device; pre-process said model by comparing the operational status information from fault report data associated with successively connected devices in a data path; and estimate a location of a faulty device, within said model, by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to the model. 40. 
A computer system comprising: a plurality of devices; a plurality of device drivers, each device driver operable to monitor an operational status of one of said plurality of devices; and a fault response processor operable to generate a model which represents the monitored devices of the computer system and an inter-connection of said monitored devices; wherein said device driver for each of said monitored devices being further operable, consequent upon a change of operational status of said monitored device, to generate fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status was caused internally within the monitored device or externally by another connected device; wherein said fault response processor is operable, consequent upon receipt of said fault report data from said device drivers, to estimate a location of a faulty device by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to said model; wherein said fault response processor is operable to pre-process said model by comparing the operational status information from fault report data associated with successively connected devices in a data path, wherein if the operational status for a preceding device on the data path has changed, fault direction information is generated for the preceding device indicating that a fault is internal, and wherein if the operational status for a succeeding device on the data path has changed, fault direction information is generated for the succeeding device indicating that a fault is external, wherein the fault report data associated with said succeeding device is disregarded in said estimation of the location of said faulty device. 41. 
A computer system comprising: a plurality of devices; a plurality of device drivers, each device driver operable to monitor an operational status of one of said plurality of devices; and a fault response processor operable to generate a model which represents the monitored devices of the computer system and an inter-connection of said monitored devices; wherein said device driver for each of said monitored devices being further operable, consequent upon a change of operational status of said monitored device, to generate fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status was caused internally within the monitored device or externally by another connected device; wherein said fault response processor is operable, consequent upon receipt of said fault report data from said device drivers, to estimate a location of a faulty device by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to said model; wherein said fault response processor is further operable to generate fault probability measures for one or more monitored devices in the model, wherein each fault probability measure is representative of a perceived likelihood that the monitored device is faulty, wherein the fault probability measures being generated by applying fault direction information and the operational status information to the model, wherein said fault response processor is operable to compare the fault probability measures for the monitored devices in the model with a predetermined threshold, and consequent upon the comparison, to estimate the location of the faulty device from the result of the comparison; wherein, for each monitored device represented in said model having a plurality of fault probability measures associated with the monitored device, said fault response processor is operable to combine the 
fault probability measures for the monitored device, wherein the combined fault probability measure being compared with said predetermined threshold to provide an estimated location of the faulty device. 42. The computer system as claimed in claim 41, wherein said fault response processor is operable to determine a rate of arrival of said fault report data and to define said analysis interval from at least one of a time at which said rate of arrival increases and a time at which said rate of arrival decreases. 43. A fault response processor for use in estimating a location of at least one of a plurality of devices of a system which is faulty, said fault response processor being operable to: generate a data model having a structure which represents said plurality of devices and the inter-connection of said devices; receive fault report data generated by device drivers following a change in the operational status of one or more of the devices, wherein said fault report data including the operational status of the device and a fault indication of whether the change in the operational status was caused internally within the device or externally by another connected device; and estimate a location of a faulty device, within said model, by applying the operational status of one or more of the devices and the fault indication corresponding to one or more of the devices to the model; wherein said fault response processor is operable to pre-process said model by comparing the operational status information from fault report data associated with successively connected devices in a data path, wherein if the operational status indicates that a preceding device on the data path is degraded or down, fault direction information is generated for the preceding device indicating that a fault is internal, and wherein if the operational status indicates that a succeeding device on the data path is down or degraded, fault direction information is generated for the succeeding device 
indicating that a fault is external, wherein the fault report data associated with the succeeding device is disregarded in said estimation of the location of said faulty device. 44. A fault response processor for use in estimating a location of at least one of a plurality of devices of a system which is faulty, said fault response processor being operable to: generate a data model having a structure which represents said plurality of devices and the inter-connection of said devices; receive fault report data generated by device drivers following a change in the operational status of one or more of the devices, wherein said fault report data including the operational status of the device and a fault indication of whether the change in the operational status was caused internally within the device or externally by another connected device; estimate a location of a faulty device, within said model, by applying the operational status of one or more of the devices and the fault indication corresponding to one or more of the devices to the model; generate fault probability measures for one or more monitored devices in the model, wherein each fault probability measure is representative of a perceived likelihood that the monitored device is faulty, wherein the fault probability measures being generated by applying fault direction information and the operational status information to the model; compare the fault probability measures for the monitored devices in the model with a predetermined threshold, and consequent upon the comparison, to estimate the location of the faulty device from the result of the comparison; and for each monitored device represented in said model having a plurality of fault probability measures associated with the monitored device, combine the fault probability measures for the monitored device, and compare the combined fault probability measure with said predetermined threshold to provide an estimated location of the faulty device. 45. 
A fault response processor as claimed in claim 44, wherein: said fault response processor is operable to identify from a time of arrival of said fault report data an analysis interval, and wherein said fault response processor is operable to estimate said location of said faulty device, a location of a faulty group of devices or a location of a faulty Field Replaceable Units from the fault report data generated within said analysis interval, and said fault response processor is operable to determine a rate of arrival of said fault report data and to define said analysis interval from at least one of a time at which said rate of arrival increases and a time at which said rate of arrival decreases. 46. A method of locating faulty devices in a system including a plurality of devices, said method comprising: monitoring an operational status of one or more of the plurality of devices; generating a model of said system, wherein the model includes a structure representing the plurality of monitored devices and the inter-connection of the monitored devices via at least one data path; generating fault report data consequent upon a change of operational status of at least one of said devices, wherein said fault report data including the operational status of the monitored device and a fault indication of whether the change of operational status of the monitored device was caused internally within the monitored device or externally by another connected device; and estimating a location of a faulty device, within said model, by applying the operational status of one or more of the monitored devices and the fault indication corresponding to one or more of the monitored devices to the model; generating fault probability measures for one or more monitored devices in the model, wherein each fault probability measure is representative of a perceived likelihood that the monitored device is faulty, wherein the fault probability measures being generated by applying fault direction 
information and the operational status information to the model; comparing the fault probability measures for the monitored devices in the model with a predetermined threshold, and consequent upon the comparison, to estimate the location of the faulty device from the result of the comparison; and for each monitored device represented in said model having a plurality of fault probability measures associated with the monitored device, combining the fault probability measures for the monitored device, and then comparing the combined fault probability measure with said predetermined threshold to provide an estimated location of the faulty device. 47. A method as claimed in claim 46, wherein said identifying said analysis interval comprises: determining a rate of arrival of said fault report data, and determining said analysis interval from at least one of a time at which said rate of arrival increases and a time at which said rate of arrival decreases.
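The threshold test recited in claims 4, 5, 30, 31, 41, 44 and 46 can be sketched as follows. The claims do not say how several fault probability measures for one device are combined, so the noisy-OR rule used here (one minus the product of the complements) is purely an assumption, as are the function names, the sample values and the threshold of 0.7.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def combine(measures: Iterable[float]) -> float:
    """Combine several fault probability measures for one device.

    Assumption: noisy-OR, i.e. the device is healthy only if every
    individual measure is a false alarm.
    """
    healthy = 1.0
    for p in measures:
        healthy *= 1.0 - p
    return 1.0 - healthy


def suspects_over_threshold(
    measures: Iterable[Tuple[str, float]], threshold: float
) -> List[str]:
    """Return devices whose combined measure reaches the threshold."""
    per_device: Dict[str, List[float]] = defaultdict(list)
    for device, p in measures:
        per_device[device].append(p)
    return sorted(d for d, ps in per_device.items() if combine(ps) >= threshold)


# Hypothetical measures derived from fault reports within one analysis interval.
sample = [("IO2L", 0.5), ("IO2L", 0.6), ("NETWORK", 0.2), ("SERIAL", 0.3)]
print(suspects_over_threshold(sample, threshold=0.7))
```

With these sample values neither IO2L measure reaches the threshold alone, but their combination does, which is the situation the combining step in the claims appears intended to cover.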