Spark task performance analysis method based on regression model
Affiliation:

(School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China)

CLC Number:

TP311

    Abstract:

    To address the evaluation and improvement of Spark task performance, this paper proposes a Spark performance evaluation and analysis method based on a heuristic algorithm and a support vector machine regression model. First, a heuristic performance evaluation algorithm is proposed. Ganglia is used to collect and process cluster resource consumption data while Spark tasks run. The k-means algorithm determines the task type, and the evaluation indices and initial weights of the heuristic algorithm are set according to that type. Task efficiency data are collected and processed from the Spark history server and, together with the cluster resource consumption data, are treated as the run-time state data of the Spark task. The final weights of the heuristic algorithm are determined by iterating over the state data, and the Spark performance evaluation regression model is then established. Second, a Spark performance analysis method based on support vector regression (SVR) is proposed. This method builds a regression model relating the Spark configuration parameters to overall performance, and then analyzes the sensitivity of that model to identify the parameters that most affect Spark performance. The experimental results show that the heuristic performance evaluation algorithm quantifies both the resource consumption and the operating efficiency of Spark tasks and evaluates overall task performance comprehensively. The SVR-based analysis method can be applied effectively to real Spark tasks and yields initial tuning advice for Spark task performance.
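    The two ideas in the abstract, a weighted performance score and a sensitivity analysis of a regression model, can be illustrated with a minimal sketch. It is not the paper's implementation. The metric names, weights, parameter ranges, and the `toy_model` function are all assumptions made for illustration; `toy_model` is a hand-written stand-in for a fitted regression model (the paper uses SVR), and the sensitivity measure shown is a simple one-at-a-time sweep, which may differ from the analysis the authors performed.

```python
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def weighted_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of metrics assumed to be normalized to [0, 1], higher is better."""
    total_weight = sum(weights.values())
    return sum(weights[name] * metrics[name] for name in weights) / total_weight


def oat_sensitivity(
    predict: Callable[[Config], float],
    baseline: Config,
    ranges: Dict[str, Tuple[float, float]],
    steps: int = 5,
) -> Dict[str, float]:
    """One-at-a-time sensitivity: sweep each parameter over its range while the
    others stay at the baseline, and record the spread of the predicted score."""
    spread = {}
    for name, (low, high) in ranges.items():
        scores = []
        for i in range(steps):
            config = dict(baseline)
            config[name] = low + (high - low) * i / (steps - 1)
            scores.append(predict(config))
        spread[name] = max(scores) - min(scores)
    return spread


def toy_model(config: Config) -> float:
    """Hypothetical stand-in for a fitted regression model (not an SVR, not from the paper)."""
    return (
        0.05 * config["spark.executor.memory"]          # memory in GB, assumed unit
        + 0.01 * config["spark.executor.cores"]
        - 0.0001 * config["spark.sql.shuffle.partitions"]
    )


# Illustrative metric values and weights (assumed, not measured data).
score = weighted_score(
    {"cpu": 0.8, "memory": 0.6, "runtime": 0.5},
    {"cpu": 0.5, "memory": 0.3, "runtime": 0.2},
)

# Illustrative baseline configuration and sweep ranges (assumed).
baseline = {
    "spark.executor.memory": 4.0,
    "spark.executor.cores": 2.0,
    "spark.sql.shuffle.partitions": 200.0,
}
ranges = {
    "spark.executor.memory": (1.0, 8.0),
    "spark.executor.cores": (1.0, 4.0),
    "spark.sql.shuffle.partitions": (100.0, 500.0),
}
sensitivity = oat_sensitivity(toy_model, baseline, ranges)

# Parameters ordered from most to least influential under the toy model.
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
print(score, ranking)
```

    In practice, `predict` would wrap whatever regression model has been trained on observed configurations and their evaluation scores, so the ranking reflects that model rather than the hand-written coefficients used here.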

History
  • Received: October 09, 2017
  • Online: June 14, 2018