[미국특허]
System and method for intelligent load distribution to minimize response time for web content access
원문보기
IPC분류정보
국가/구분
United States(US) Patent
등록
국제특허분류(IPC7판)
G06F-015/173
G06F-015/16
출원번호
US-0703121
(2000-10-31)
발명자
/ 주소
Candan,Kasim Selcuk
Li,Wen Syan
출원인 / 주소
NEC Corporation
대리인 / 주소
Foley &
인용정보
피인용 횟수 :
8인용 특허 :
6
초록▼
A content delivery system having m servers, S'={S1, . . . ,Sm}, n active customers, C'={C1, . . . , C n}, and g geographic locations, G'={G1, . . . , Gg} is disclosed, wherein sdelk is a server delay of server S k, ndelj,k is a network delay observed by customers in geographic location Gj while ret
A content delivery system having m servers, S'={S1, . . . ,Sm}, n active customers, C'={C1, . . . , C n}, and g geographic locations, G'={G1, . . . , Gg} is disclosed, wherein sdelk is a server delay of server S k, ndelj,k is a network delay observed by customers in geographic location Gj while retrieving content from server S k, pj is a priority value for customer Ci, ci is a total load of customer Cii, ui,j is a fraction of requests coming to customer Ci from region G j, ai,j,k is a mapping representing a fraction of requests coming to customer Ci from region Gj that have been redirected to server Sk, and sk represents a load capacity of server Sk. Within such a system, a method for distributing server loads includes the steps of representing an average prioritized observed response time as and then generating a mapping that assigns requests from customers to a particular server while minimizing AORT. A heuristic algorithm is used to generate the mapping, wherein large ai,j,k values are assigned to small ui,j횞ci횞(sdelk+ndelj,k) values to produce a smaller overall AORT value.
대표청구항▼
What is claimed is: 1. In a content delivery system having m servers, S'={S 1, . . . ,Sm}, n active customers, C'={C1, . . . , Cn}, and g geographic locations, G'={G1, . . . ,G g}, wherein sdelk is a server delay of server Sk, ndelj,k is a network delay observed by customers in geographic location
What is claimed is: 1. In a content delivery system having m servers, S'={S 1, . . . ,Sm}, n active customers, C'={C1, . . . , Cn}, and g geographic locations, G'={G1, . . . ,G g}, wherein sdelk is a server delay of server Sk, ndelj,k is a network delay observed by customers in geographic location Gj while retrieving content from server Sk, pj is a priority value for customer Ci, ci is a total load of customer Cii, ui,j is a fraction of requests coming to customer Ci from region Gj, ai,j, k is a mapping representing a fraction of requests coming to customer Ci from region Gj that have been redirected to server Sk, and sk represents a load capacity of server Sk, a method for distributing server loads, the method comprising the steps of: representing an average prioritized observed response time as generating a mapping that assigns requests from customers to a particular server while minimizing AORT; and assigning requests to the particular server based on the generated mapping. 2. The method as recited in claim 1, wherein the assigning further including the step of assigning all requests from all customers in all regions to a particular server such that, for each 3. The method as recited in claim 1, wherein the assigning step assigns requests to a particular server while ensuring that the load capacity of each server is not exceeded such that, for each for each S kεS', 4. The method as recited in claim 1, wherein the assigning step includes assigning requests to a particular server while balancing the load of each server to within a maximum allowed deviation from a balanced state Θ such that, for all pairs of servers, Sk and 5. The method as recited in claim 4, wherein if the content delivery system should add one or more servers, or remove one or more customers, the load of each server will not be redistributed unless the maximum allowed deviation from a balanced state Θ is exceeded. 6. The method as recited in claim 1, further including the step of adding one or more customers to the content delivery system only if AORTnew≦(1+Φ)횞AORTold: wherein AORTold and AORTnew are old and new values of AORT defined as and wherein a'i,j,k is a new mapping resulting from the addition of one or more customers; and wherein Φ is an allowable change in AORT for existing clients. 7. The method as recited in claim 1, further including the step of using a linear constraint solver to generate the mapping. 8. The method as recited in claim 1, further including the step of using a non-linear constraint solver to generate the mapping. 9. The method as recited in claim 1, further including the step of using a heuristic algorithm to generate the mapping, the heuristic algorithm comprising the step of assigning large ai,j,k values to small ui,j횞ci횞(sdelk+ndelj, k) values to produce a smaller overall AORT value. 10. The method as recited in claim 9, the heuristic algorithm comprising the steps of: generating a plurality of sorted lists by sorting Ci values in increasing order of c1, sorting <Ci,G j> pairs in increasing order of ui,j, sorting Sk values in increasing order of sdelk, and sorting <G j,Sk> pairs in increasing order of ndelj,k; starting with a top-most, smallest value item in each list, identifying comparable smallest-value items from the other lists to generate a plurality of <Ci,Gj,Sk> triples equivalent to the number of sorted lists; selecting from the plurality of <Ci,Gj, Sk> triples, the <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ ndelj,k) value; assigning to a server Sk of the selected <C i,Gj,Sk> triple a remaining load from the < Ci,Gj> pair; and repeating the heuristic algorithm starting with generating the plurality of sorted lists, taking into account the changes in the values of the Ci values and the <Ci,Gj> pairs as a result of the previous server assignment during each iteration, until the load from all <Ci,Gj> pairs has been assigned to a server Sk; wherein if, during any iteration of the heuristic algorithm, the load capacity of the server Sk is not sufficient to handle the remaining load, the remaining load capacity of the server Sk is assigned to some of the load of the <Ci,Gj> pair, and an unassigned portion of the load from the <Ci,G j> pair is reinserted into the iterative process. 11. The method as recited in claim 10, the heuristic algorithm further including the steps of: generating a load-capacity prioritized sorted list by sorting Ci values in decreasing order of remaining load capacity s k; starting with a top-most, largest value item in the load-capacity prioritized list, identifying comparable smallest-value items from the other lists to generate a load-capacity prioritized <C i,Gj,Sk> triple; considering the load-capacity prioritized <Ci, Gj,Sk> triple in the selection of the top-most < Ci,Gj,Sk> triple with the smallest ui, j횞ci횞(sdelk+ndelj,k) value; and repeating the heuristic algorithm starting with generating the plurality of sorted lists, taking into account the changes in the values of the Ci values, the <Ci,Gj> pairs, and remaining load capacity as a result of the previous server assignment during each iteration, until the load from all <Ci, Gj> pairs has been assigned to a server Sk. 12. The method as recited in claim 10, the heuristic algorithm further including the steps of: generating a list of content-available <Ci,S k> pairs in which the content of customer Ci is stored in server Sk; and selecting the <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ ndelj,k) value that is also part of the list of content-available <Ci,Sk> pairs; wherein if, during any iteration of the heuristic algorithm, there is no <Ci,Gj,Sk> triple that is also part of the list of content-available <Ci,Sk> pairs, a suitable <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ndelj, k) value is chosen and the data of customer Ci is migrated to server Sk. 13. The method as recited in claim 12, the heuristic algorithm further including the step of generating a list of content-unavailable <Ci,Sk> pairs in increasing order of migration time penalty for which the content of customer Ci is not stored in server Sk; wherein if, during any iteration of the heuristic algorithm, there is no <Ci,Gj,Sk> triple that is also part of the list of content-available <Ci,Sk> pairs, a suitable <Ci,Gj,Sk> triple with the smallest combined ui,j횞ci횞(sdelk+ ndelj,k) value and <Ci,Sk> migration time penalty is chosen and the data of customer Ci is migrated to server Sk. 14. The method as recited in claim 1, further including the step of estimating ndelj,k for non-persistent connections using HTTP logs, the step of estimating ndelj,k for non-persistent connections using HTTP logs comprising the steps of: computing an estimated round trip delay Δrserver, client as tcon--req--rec(i+1)-tresp-- send--end(i) from information stored in the HTTP logs, where tcon--req--rec(i+1) represents a time at which a connection request message is received by the server for an (i+1)th object, and tresp--send-- end(i) represents a time at which the server stops sending an ith object; and computing the response time as where tcon--req--rec represents a time at which a connection request message is received by the server for an object, and tresp --send--end represents a time at which the server stops sending the object. 15. The method as recited in claim 1, further including the step of estimating ndelj,k for persistent connections using HTTP logs, the step of estimating ndelj,k for persistent connections using HTTP logs comprising the steps of: computing an estimated round trip delay Δrserver, client as tcon--close--rec-tresp--send --end(last), where tcon-- close--rec represents a time at which the server receives a request to close the persistent connection, and tresp--send-- end(last) represents a time at which the server stops sending a response for a last request; and computing the response time as where tcon--req--rec represents a connection request time. 16. A method for estimating a response time observed by a requesting entity in a geographic location while retrieving objects from a server through a non-persistent connection using HTTP logs, the method comprising the steps of: computing an estimated round trip delay Δrserver, client as tcon--req--rec(i+1)-tresp-- send--end(i) from information stored in the HTTP logs, where tcon--req--rec(i+1) represents a time at which a connection request message is received by the server from the requesting entity for an (i+1)th object, and tresp-- send--end(i) represents a time at which the server stops sending an ith object to the requesting entity; and computing the response time as where tcon--req--rec represents a time at which a connection request message is received by the server from the requesting entity for an object, and tresp--send-- end represents a time at which the server stops sending the object to the requesting entity. 17. A method for estimating a response time observed by a requesting entity in a geographic location while retrieving objects from a server through a persistent connection using HTTP logs, the method comprising the steps of: computing an estimated round trip delay Δrserver, client as tcon--close--rec-tresp--send --end(last), where tcon-- close--rec represents a time at which the server receives a request to close the persistent connection, and tresp--send-- end(last) represents a time at which the server stops sending a response for a last request; and computing the response time as where tcon--req--rec represents a connection request time. 18. In a content delivery network having m servers, S'={S 1, . . . ,Sm}, n active customers, C'={C1, . . . , Cn}, and g geographic locations, G'={G1, . . . ,G g}, a content delivery system for distributing server loads, the content delivery system comprising: memory for storing a server delay sdelk of server Sk, a network delay ndelj,k observed by customers in geographic location Gj while retrieving content from server S k, a priority value pj for customer Ci, a total load ci of customer Cii, a fraction of requests ui, j coming to customer Ci from region Gj, a mapping ai,j,k representing a fraction of requests coming to customer Ci from region Gj that have been redirected to server Sk, and a load capacity sk of server Sk; and a processor programmed for representing an average prioritized observed response time as generating a mapping that assigns requests from customers to a particular server while minimizing AORT and the processor further programmed for assigning requests to a particular server based on said mapping. 19. The system as recited in claim 18, the processor further programmed for assigning all requests from all customers in all regions to a particular server such that, for each CiεC', GgεG', 20. The system as recited in claim 18, the processor further programmed for assigning requests to a particular server while ensuring that the load capacity of each server is not exceeded such that, for each for each SkεS', 21. The system as recited in claim 18, the processor further programmed for assigning requests to a particular server while balancing the load of each server to within a maximum allowed deviation from a balanced state Θ such that, for all pairs of servers, Sk and Sl, 22. The system as recited in claim 21, wherein if the content delivery system should add one or more servers, or remove one or more customers, the processor is further programmed for not redistributing the load of each server unless the maximum allowed deviation from a balanced state Θ is exceeded. 23. The system as recited in claim 18, wherein the processor is programmed for allowing one or more customers to be added to the content delivery system only if AORTnew≦(1+Φ) 횞AORTold: wherein AORTold and AORTnew are old and new values of AORT defined as and wherein a'i,j,k is a new mapping resulting from the addition of one or more customers; and wherein Φ is an allowable change in AORT for existing clients. 24. The system as recited in claim 18, the processor further programmed for using a linear constraint solver to generate the mapping. 25. The system as recited in claim 18, the processor further programmed for using a non-linear constraint solver to generate the mapping. 26. The system as recited in claim 18, the processor further programmed for generating the mapping by assigning large ai,j,k values to small ui,j횞ci횞(sdelk+ndelj,k ) values to produce a smaller overall AORT value. 27. The system as recited in claim 26, the processor further programmed for: generating a plurality of sorted lists by sorting Ci values in increasing order of c1, sorting <Ci,G j> pairs in increasing order of ui,j, sorting Sk values in increasing order of sdelk, and sorting <G j,Sk> pairs in increasing order of ndelj,k; starting with a top-most, smallest value item in each list, identifying comparable smallest-value items from the other lists to generate a plurality of <Ci,Gj,Sk> triples equivalent to the number of sorted lists; selecting from the plurality of <Ci,Gj, Sk> triples, the <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ ndelj,k) value; assigning to a server Sk of the selected <C i,Gj,Sk> triple a remaining load from the < Ci,Gj> pair; and repeating the heuristic algorithm starting with generating the plurality of sorted lists, taking into account the changes in the values of the Ci values and the <Ci,Gj> pairs as a result of the previous server assignment during each iteration, until the load from all <Ci,Gj> pairs has been assigned to a server Sk; wherein if, during any iteration of the heuristic algorithm, the load capacity of the server Sk is not sufficient to handle the remaining load, the remaining load capacity of the server Sk is assigned to some of the load of the <Ci,Gj> pair, and an unassigned portion of the load from the <Ci,G j> pair is reinserted into the iterative process. 28. The system as recited in claim 27, the processor further programmed for: generating a load-capacity prioritized sorted list by sorting Ci values in decreasing order of remaining load capacity s k; starting with a top-most, largest value item in the load-capacity prioritized list, identifying comparable smallest-value items from the other lists to generate a load-capacity prioritized <C i,Gj,Sk> triple; considering the load-capacity prioritized <Ci, Gj,Sk> triple in the selection of the top-most < Ci,Gj,Sk> triple with the smallest ui, j횞ci횞(sdelk+ndelj,k) value; and repeating the heuristic algorithm starting with generating the plurality of sorted lists, taking into account the changes in the values of the Ci values, the <Ci,Gj> pairs, and remaining load capacity as a result of the previous server assignment during each iteration, until the load from all <Ci, Gj> pairs has been assigned to a server Sk. 29. The system as recited in claim 27, the processor further programmed for: generating a list of content-available <Ci,S k> pairs in which the content of customer Ci is stored in server Sk; and selecting the <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ ndelj,k) value that is also part of the list of content-available <Ci,Sk> pairs; wherein if, during any iteration of the heuristic algorithm, there is no <Ci,Gj,Sk> triple that is also part of the list of content-available <Ci,Sk> pairs, a suitable <Ci,Gj,Sk> triple with the smallest ui,j횞ci횞(sdelk+ndelj, k) value is chosen and the data of customer Ci is migrated to server Sk. 30. The system as recited in claim 29, the processor further programmed for generating a list of content-unavailable <Ci, Sk> pairs in increasing order of migration time penalty for which the content of customer Ci is not stored in server S k; wherein if, during any iteration of the heuristic algorithm, there is no <Ci,Gj,Sk> triple that is also part of the list of content-available <Ci,Sk> pairs, the processor is further programmed for selecting a suitable < Ci,Gj,Sk> triple with the smallest combined ui,j횞ci횞(sdelk+ndelj,k) value and < Ci,Sk> migration time penalty, and migrating the data of customer Ci to server Sk. 31. The system as recited in claim 18, the processor further programmed for estimating ndelj,k for non-persistent connections using HTTP logs by: computing an estimated round trip delay Δrserver, client as tcon--req-- rec(i+1)-tresp--send--end(i) from information stored in the HTTP logs, where tcon--req-- rec(i+1) represents a time at which a connection request message is received by the server for an (i+1)th object, and t resp--send--end (i) represents a time at which the server stops sending an ith object; and computing the response time as where tcon--req--rec represents a time at which a connection request message is received by the server for an object, and tresp --send--end represents a time at which the server stops sending the object. 32. The system as recited in claim 18, the processor further programmed for estimating ndelj,k for persistent connections using HTTP logs by: computing an estimated round trip delay Δrserver, client as tcon--close--rec-tresp--send --end(last), where tcon-- close--rec represents a time at which the server receives a request to close the persistent connection, and tresp--send-- end(last) represents a time at which the server stops sending a response for a last request; and computing the response time as where tcon--req--rec represents a connection request time.
Aggarwal Charu Chandra ; Malkin Peter Kenneth ; Schloss Robert Jeffrey ; Yu Philip Shi-lung, Collaborative caching of a requested object by a lower level node as a function of the caching status of the object at.
Santoro, Mario A.; Brotman, Herani; Glasser, Alan L.; Miros, James; Spatscheck, Oliver; Van der Merwe, Jacobus E., Integrated adaptive anycast for content distribution.
Dickson, Lawrence J., Method and apparatus implemented in processors for real-time scheduling and task organization based on response time order of magnitude.
Dickson, Lawrence J., Method and apparatus implemented in processors for real-time scheduling and task organization based on response time order of magnitude.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.