Structured Query Language Injection Penetration Test Case Generation Based on Formal Description

2015-12-20 09:13

HAN Ming (韩 明), MIAO Chang-yun (苗长云)

1 School of Mechanical Engineering, Tianjin Polytechnic University, Tianjin 300387, China

2 School of Electronics and Information Engineering, Tianjin Polytechnic University, Tianjin 300387, China

Introduction

Software testing has moved beyond merely ascertaining whether software can accomplish its expected functions. It should also ascertain whether the software can exhibit any unexpected behavior (e.g., a security vulnerability exploited by a malicious attack). Software security testing is therefore becoming highly valued. The penetration test[1] is a significant security testing approach for detecting software vulnerabilities. The philosophy behind the penetration test is to expose vulnerabilities through testers' mock attacks before attackers can exploit the software for real[2].

Software testing has been widely used as a way to help engineers develop high-quality systems[3]. The penetration test has also attracted increasing interest from industry and academia. In particular, research on penetration testing for web applications has become more and more imperative, since web vulnerabilities now pose serious threats to web security[4]. SQL injection, for example, is a typical and widespread web vulnerability that accounts for sensitive information leakage, authentication bypass, system hijacking, and/or other serious security damage to web applications[5]. It is therefore crucial to ascertain whether a web application has an SQL injection vulnerability before deployment, and the SQL injection penetration test deserves intensive research attention.

Although the penetration test has many advantages in detecting software security vulnerabilities and a lot of effort has been devoted to its research, there are still many deficiencies in its research methods and practice. Penetration testing for web applications, for example, suffers from comparatively low testing accuracy[4-9]: the test results often contain a large number of false positives (reported vulnerabilities that in fact do not exist) and false negatives (existing vulnerabilities that have not been reported). In addition, the current penetration test is still heavily dependent on security experts[10], and the accuracy of a test campaign relies mainly on their expertise and diligence.

Information gathering, attack generation, and response analysis are the three key phases of penetration testing[6]. However, current research on SQL injection penetration testing mainly focuses on the information gathering and response analysis phases. For the information gathering phase, for example, Refs.[7-8] discussed improved crawler technology for finding more injection points in the web applications under test, and Halfond et al.[1] proposed an information gathering method based on source code analysis (a non-crawler approach) to improve the input vector identification of penetration testing. For the response analysis phase, Antunes and Vieira[5] proposed an approach that improves test accuracy in detecting SQL/XPath injection vulnerabilities by comparing the structure of the SQL/XPath commands issued in the presence and absence of attacks; Halfond et al.[6] also presented a response analysis approach for the SQL injection penetration test in which each query is parsed before it is issued to the database to check whether it is a successful attack; if it is, a vulnerability exists, as the attack has already broken through the defenses. These are typical research improvements to the web penetration test.

However, most research on the penetration test of SQL injection pays little attention to the attack generation phase. In particular, the regularity and adequacy of the test cases (i.e., the attack pattern library) in this phase and their impact on test accuracy have not been well studied. The test cases generated or used in related work are mostly obtained by random enumeration (e.g., Refs.[8-11]), which cannot guarantee their regularity and adequacy. In fact, the test case set is an important factor affecting test accuracy. An irregular or inadequate test case set cannot fully exercise the software's defense mechanisms and trigger (detect) certain vulnerabilities, which causes false negatives and impairs test accuracy. In other words, the randomly enumerated test cases used in current related work are prone to false negatives.

To address these problems, we propose a formal description based testing method for the SQL injection vulnerability (Fig.1). A purpose-based attack tree model of SQL injection is proposed, and under the guidance of this model, formal descriptions of the SQL injection vulnerability features and SQL injection attack inputs are established. These models are then instantiated according to new coverage criteria, and executable test cases are generated. Experiments show that, compared with the randomly enumerated test cases used in other works, the test cases generated by our method detect the SQL injection vulnerability more effectively; thus false negatives are reduced and test accuracy is improved.

Therefore, the contribution of this paper is twofold. Firstly, we initiate the research on formal description based penetration test case generation for the SQL injection vulnerability; secondly, we propose coverage criteria of penetration test cases for the SQL injection vulnerability, which serve as the guidance and metric of penetration testing adequacy. These are two key factors for research on the attack generation phase of the penetration test, but they are often neglected by other academic research on penetration test improvement.

Fig.1 The overview of our approach

1 Purpose-Based Attack Tree Model for SQL Injection

SQL injection is one of the most serious and widespread security vulnerabilities in current web applications[12]. An SQL injection attack takes place when a hacker changes the semantic or syntactic logic of an SQL text string by inserting SQL keywords or special symbols into the original SQL command, which is executed at the database layer of an application. Halfond et al.[13] summarized the main techniques for performing SQL injection attacks and provided information and examples of how these techniques work.
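The mechanism can be illustrated with a minimal Python sketch. It assumes a hypothetical users table in an in-memory SQLite database and a hypothetical build_login_query helper (neither appears in this paper); because the query text is built by string concatenation, a tautology input changes the logic of the WHERE clause.

```python
import sqlite3


def build_login_query(usr: str, pwd: str) -> str:
    # Vulnerable on purpose: user input is concatenated into the SQL text.
    return ("SELECT COUNT(*) FROM users "
            "WHERE name = '" + usr + "' AND pwd = '" + pwd + "'")


def login(conn: sqlite3.Connection, usr: str, pwd: str) -> bool:
    # Authenticated if at least one row matches the (possibly altered) query.
    (count,) = conn.execute(build_login_query(usr, pwd)).fetchone()
    return count > 0


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical schema and data, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, pwd TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


conn = make_demo_db()
legit = login(conn, "alice", "s3cret")         # correct password
wrong = login(conn, "alice", "guess")          # wrong password
injected = login(conn, "alice", "' OR '1'='1")  # tautology appended to WHERE
```

In this sketch the injected password closes the string literal and appends an OR condition that is always true, so the wrong-password check no longer protects the account.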

The SQL injection vulnerability can be divided into two sub-classes: first-order and second-order. A first-order SQL injection vulnerability results in immediate SQL command execution upon user input submission, while a second-order SQL injection requires raw user input to be loaded from the database[4]. To simplify our discussion, we focus on the first-order SQL injection vulnerability in this paper.

We first discuss the research on modeling of SQL injection attack.

Current research on modeling SQL injection attacks generally applies the attack tree[14] to describe their regularity. For example, Wang et al.[15] proposed an augmented attack tree model for SQL injection attacks. Their augmented attack tree describes the major patterns of SQL injection attacks against web applications, including classified attack steps and various types of attack input characters; Marback et al.[16] used the attack tree to describe the SQL injection attack process against a target web application, and accordingly generated the security test sequence for this web application.

There is currently little research on SQL injection attack modeling, and the models mentioned above do not adequately reflect the logical regularity of SQL injection attacks.

The SQL injection attack tree model proposed by Wang et al.[15], for example, mainly focuses on describing SQL injection attack inputs (signatures) through regular expressions. This makes the model scene-specific: it describes only specific attack inputs in certain attack scenes and cannot be widely applied to all attack scenarios. Moreover, injection signatures described in the form of regular expressions cannot reflect the purpose of the attackers' injection. The model proposed by Marback et al.[16] only abstracts the test/attack process for a particular web application and does not reflect the regularity of attack inputs (test case inputs).

We therefore propose a purpose-based attack tree model for SQL injection, which describes the SQL injection attack from the perspective of the attackers' immediate purpose.

The model we propose in Fig.2 classifies SQL injection attacks into three classes according to the attackers' immediate purposes: stealing system information, bypassing authentication, and remote command execution.

It is a comparatively comprehensive description of current SQL injection attacks. Moreover, the model does not describe the SQL injection attack from the perspective of attack input patterns (signatures) or specific attack steps, which helps to separate the representing symbols of attack inputs from their character patterns (e.g., regular expressions). Our model can therefore describe SQL injection attacks in panorama and at a high level of abstraction.

Fig.2 The purpose-based attack tree model for SQL injection

2 The Formal Description of SQL Injection Vulnerability

We further establish the formal description of SQL injection vulnerability features under the guidance of the attack tree we proposed. Namely, we describe the various attack inputs and vulnerable responses of the web application in a formal language.

According to the description of the attack tree model in Fig.2, we have the following definitions.

Definition 1 WA (the web application under test) has the SQL injection vulnerability, denoted as SQLI(WA), then

Definition 2 The steal system information vulnerability of SQL injection is denoted as I(WA), and then

Definition 3 Attackers exploit the SQL injection vulnerability by utilizing error messages to get valuable information, denoted as deformSInject(WA), and then

Here attacker.input denotes the set of attackers' injection (attack) inputs to WA; the detailed definitions of attacker.input are listed in Table 1; attacker.GET_knowledge() denotes that attackers can get useful information for their attack behavior; WA.response() denotes the responses of WA to attack inputs, and WA.response().error denotes that WA generates an error message observable to attackers.

Definition 4 The blind injection vulnerability of SQL injection is denoted as blindInject(WA). This vulnerability can be exploited in two ways, condition_inference() and timing_inference().

condition_inference(WA) ↔ (attacker.input_i ∈ AND Tautology ∧ attacker.input_j ∈ AND Contradiction ∨ attacker.input_i ∈ AND Contradiction ∧ attacker.input_j ∈ AND Tautology) ∧ WA.response(attacker.input_i).state ≠ WA.

Here WA.response().state denotes the response state of WA to inputs; WA.response().run denotes whether the injected commands are executed in WA (TRUE or FALSE).

Definition 5 The remote command execution vulnerability of SQL injection means that attackers can inject executable commands into WA; it is denoted as R(WA).

Here SQLRuning() denotes the SQL commands injection attack and SPRuning() denotes the non-SQL commands injection attack.

Definition 6 The bypass authentication vulnerability of SQL injection means that attackers can circumvent the authentication control mechanism of WA through SQL injection; it is denoted as L(WA).

Here WA.response().authenticated denotes whether the attacker passes the authentication of WA; usr and pwd denote the username and password submitted to WA, respectively.

The symbols related to attacker.input in Formulas (1)-(10) are defined in Table 1.

Table 1 Symbols for SQL injection attack inputs

We have thus established a set of formal descriptions of SQL injection. Formulas (1)-(10) clarify the external features of the SQL injection vulnerability. On one hand, we classify and formally describe the current SQL injection attack inputs, which overcomes the infiniteness and irregularity of the randomly enumerated test cases (attack inputs) used in other research (e.g., Refs.[8,17]) and provides guidance for penetration test case generation; on the other hand, these formal expressions are indispensable criteria for determining the existence of an SQL injection vulnerability in the penetration test.
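As an illustration of how such a formal expression can serve as a decision criterion, the condition inference rule of Definition 4 can be sketched in Python. The sketch assumes that the rule compares the response states of a paired AND-tautology input and AND-contradiction input; the respond callable is a hypothetical stand-in for WA.response(...).state, and the two payload strings and toy applications are illustrative assumptions rather than entries of Table 1.

```python
from typing import Callable

# Illustrative members of the AND-tautology and AND-contradiction input sets
# (assumed values, not taken from the paper's Table 1).
AND_TAUTOLOGY = " AND 1=1"
AND_CONTRADICTION = " AND 1=2"


def condition_inference(respond: Callable[[str], str], base_value: str) -> bool:
    """Return True if the response states of the paired inputs differ.

    `respond` is a hypothetical stand-in for WA.response(input).state.
    """
    state_true = respond(base_value + AND_TAUTOLOGY)
    state_false = respond(base_value + AND_CONTRADICTION)
    return state_true != state_false


def vulnerable_app(value: str) -> str:
    # Toy application: the injected condition influences the page content.
    return "empty page" if value.endswith("1=2") else "record page"


def safe_app(value: str) -> str:
    # Toy application: any non-numeric id is rejected in the same way.
    return "record page" if value.isdigit() else "invalid id"


vulnerable_result = condition_inference(vulnerable_app, "12")
safe_result = condition_inference(safe_app, "12")
```

Under these assumptions, only the toy application whose response depends on the injected condition is reported as vulnerable.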

3 The Modeling of SQL Injection Penetration Test Case

In related academic research, a test case is generally defined as a triple t = (Pre, In, Out), where Pre is the precondition, In denotes the test case inputs, and Out is the expected output. For the SQL injection penetration test case, we define Pre as the type of input vectors[6] in web applications (finding these input vectors is the precondition of an SQL injection attack), In as the set of attack inputs (the attacker.input mentioned above), and Out as the vulnerable response of the web application for the SQL injection vulnerability.
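The triple can be represented directly as a small data structure. The following Python sketch is only one possible encoding; the class name PenTestCase, its field names, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PenTestCase:
    """One penetration test case t = (Pre, In, Out)."""
    pre: str                 # Pre: type of input vector to be attacked
    inputs: Tuple[str, ...]  # In: attack inputs submitted to that vector
    out: str                 # Out: response that indicates a vulnerability


# Hypothetical example for a URL GET parameter.
tc = PenTestCase(
    pre="URL GET parameter",
    inputs=("12 AND 1=1", "12 AND 1=2"),
    out="response states differ",
)
```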

According to the definition of symbols in Table 1 and the description of Formulas (1)-(10), we define the SQL injection penetration test case set in Table 2. Object in Table 2 denotes the testing (attack) purpose; we distinguish five purposes of SQL injection in the penetration test.

Table 2 The formal description of SQL injection penetration test case set

In this paper, we concentrate on In and Out, so we omit a detailed study of the model-driven setting of Pre. Here we consider just two kinds of typical input vectors, the URL GET parameters (e.g., /List.asp?id=12) and the login forms in web applications[12]. The Pre in Table 2 can be adjusted according to testing conditions or needs, for example by adding cookies to Pre.

4 The Instantiation of Test Case Model

Table 2 describes the regularity of current SQL injection attacks; accordingly, the formal description of In and Out in it reveals what test cases should be used in the SQL injection penetration test. Nevertheless, testers should also be advised on the number of test cases that should be used. The former is the issue of test case modeling, and the latter is the issue of model instantiation. Test case model instantiation means that the formal description of test cases is translated into executable test cases in the context of the fingerprint of the web application and the coverage criteria.

The instantiation of the test case input In needs an adequacy criterion. To test whether the defense mechanism of a web application can block various patterns of SQL injection attack inputs, the guiding principle of the coverage criteria is to make the test case inputs cover more attack input patterns. This principle differs from the program-based (white box, internal structure-based) or specification-based (black box, function-based) criteria proposed in other related research, so we propose some new coverage criteria based on equivalence partitioning of the input domain (partition testing) to guide the generation of penetration test case inputs.

Definition 7 The command verbs coverage criterion:

where MV denotes the set of executable command verbs that should be used in the test for a web application, TC denotes the set of test case inputs, and <tc, mv> means that the command verb mv is used in the construction of test case input tc and no other command verbs are used (contained) in tc.

Definition 8 The relation predicate coverage criterion:

where OP denotes the set of SQL syntax relation predicates that should be used in the test, and <tc, op> means that the predicate op is used in the construction of test case input tc and no other predicates are used (contained) in tc.
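One plausible reading of these two criteria is that every element of MV (or OP) must be the only such element used by at least one test case input in TC. The Python sketch below checks this reading; the helper names, the token matching rules, and the example sets are assumptions made for illustration and are not taken from the paper.

```python
import re
from typing import Iterable, Set


def used_tokens(tc: str, vocabulary: Iterable[str]) -> Set[str]:
    """Return the vocabulary items that occur in test case input `tc`."""
    found = set()
    for item in vocabulary:
        if re.fullmatch(r"\w+", item):
            # Keyword-like items are matched as whole words.
            pattern = r"\b" + re.escape(item) + r"\b"
        else:
            # Symbol-like items must not be part of a longer operator.
            pattern = r"(?<![<>=!])" + re.escape(item) + r"(?![<>=])"
        if re.search(pattern, tc, flags=re.IGNORECASE):
            found.add(item)
    return found


def covers(test_inputs: Iterable[str], vocabulary: Set[str]) -> bool:
    """Check that each vocabulary item is the sole item of some test input."""
    covered = set()
    for tc in test_inputs:
        used = used_tokens(tc, vocabulary)
        if len(used) == 1:  # the <tc, x> relation: x and nothing else
            covered |= used
    return covered == vocabulary


# Hypothetical command verb set MV and relation predicate set OP.
MV = {"DROP", "SHUTDOWN"}
OP = {"=", ">=", "LIKE"}

verbs_ok = covers(["1; DROP TABLE t--", "1; SHUTDOWN--"], MV)
verbs_mixed = covers(["1; DROP TABLE t; SHUTDOWN--"], MV)
preds_ok = covers(["1 OR 1=1", "1 OR 2>=1", "1 OR 'a' LIKE 'a'"], OP)
```

A test input that mixes two verbs does not count toward either verb in this sketch, which mirrors the "no other command verbs used" condition in the definition.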

The random coverage ratio (RCR) is a method of selecting a finite number of test cases from an infinite set of illegal values[18]. The RCR selects illegal values arbitrarily from the available set of illegal inputs, and the selected values can be regarded as representative of the whole set. We choose the RCR for the instantiation of the Deformed characters set, generating several random illegal characters/strings as the instantiated cases. For the instantiation of the stored procedures set, which contains a large number of command verbs, we also use the RCR method to randomly select a certain number of stored procedure command verbs as the elements of the MV set (Definition 7) and then apply the command verbs coverage criterion to generate test cases. The number of randomly selected test cases is determined according to the test scale.
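The random selection step can be sketched as follows. This is a minimal illustration that assumes the candidates are available as a finite pool; the function name select_representatives, the fixed seed, and the listed stored procedure names are illustrative assumptions and not the settings used in our experiments.

```python
import random
from typing import List, Sequence


def select_representatives(candidates: Sequence[str], k: int,
                           seed: int = 0) -> List[str]:
    """Pick k distinct items at random from a finite candidate pool."""
    rng = random.Random(seed)  # fixed seed so that a test run is repeatable
    return rng.sample(list(candidates), k)


# Illustrative pool of SQL Server stored procedure names (assumed).
pool = ["xp_cmdshell", "sp_addlogin", "xp_regread", "sp_password", "xp_dirtree"]

mv = select_representatives(pool, 3)
mv_again = select_representatives(pool, 3)
```

The parameter k plays the role of the test scale: a larger k yields more elements in MV and hence more test cases under the command verbs coverage criterion.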

The setting for test case instantiation in our study is listed in Tables 3-4.

Table 3 Setting for test case inputs instantiation

The instantiation of the test case output Out mainly depends on the fingerprint of the web application, which includes the database type, the type and version of the running web server, etc.

Table 4 Setting for test case outputs instantiation

5 Evaluation

Here, we describe the experimental design (subject web applications, seeded SQL injection vulnerabilities, etc.), measurement metrics, and results.

5.1 Experiment subjects and SQL injection vulnerability seeded in them

To verify our test case generation method, we created two web applications as the experiment subjects, a JSP website and an ASP website. They run in an IIS 5.1 environment and use HTML, and their back-end database is SQL Server 2000 SP4. The JSP subject has around 5 500 lines of code and the ASP subject has around 15 000 lines of code. The two subject web applications have a login authentication module, a client management module, a database connectivity module, etc. Their functions and structures imitate the traits of common real web applications, so the penetration test on them can be regarded as representative of actual web application testing. Moreover, by testing our own subject web applications, we can evaluate the performance of different test cases in a controlled environment (known SQL injection vulnerabilities and inadequate defense mechanisms).

Fong et al.[19] proposed a testing tool evaluation method based on the levels of defense of web applications. Following that approach, we set two levels of defense against the SQL injection attack (Table 5) in the channels through which the two subject web applications access their back-end databases.

Table 5 The defense level of subject web applications

The channels adopting Level 0 have an SQL injection vulnerability caused by a complete lack of defense against users' input, while the channels adopting Level 1 have an SQL injection vulnerability caused by an inadequate defense mechanism (the keywords in the blacklist filter of Level 1 are not sufficient, so some attack inputs can escape the filter).
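The effect of an insufficient blacklist can be illustrated with a short Python sketch. The keyword list below is hypothetical and is not the Level 1 filter of Table 5; it only shows how a filter that knows one tautology pattern lets other patterns through.

```python
# Hypothetical blacklist (assumed for illustration only).
BLACKLIST = ("or 1=1", "union", "--")


def blacklist_filter(user_input: str) -> bool:
    """Return True if the input passes the filter (no blacklisted keyword)."""
    lowered = user_input.lower()
    return not any(keyword in lowered for keyword in BLACKLIST)


# Three tautology-style attack inputs built with different predicates.
attack_inputs = ["1 OR 1=1", "1 OR 2>1", "1 OR 'a' LIKE 'a'"]

# Inputs that escape the filter and reach the database layer.
escaped = [a for a in attack_inputs if blacklist_filter(a)]
```

In this sketch only the first input is blocked; a test case set that contains just that well-known pattern would wrongly report the channel as protected.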

5.2 Automatic penetration test tools and instantiated test case

We created an automated SQL injection penetration test tool for web applications, Model Based Testing (MBT) scan, to implement the test. It applies the widely used crawling-attack-analysis method of Refs.[4,9] to detect the SQL injection vulnerabilities seeded in the subject web applications. Its crawler module traverses the web application to access all reachable pages and parse out the input vectors (here the URL GET parameters and login forms) contained in these pages. Its attack module submits the instantiated test case inputs to the corresponding input vectors found by the crawler module, and its analysis module then analyzes the application's responses to judge whether a vulnerability has been triggered.
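The step in which an attack module places a test case input into a URL GET parameter can be sketched with the Python standard library. This is not the implementation of MBT scan; the function name inject_get_parameter and the host test.local are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def inject_get_parameter(url: str, param: str, payload: str) -> str:
    """Append `payload` to the value of GET parameter `param` in `url`."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Only the targeted parameter is modified; the others are kept as they are.
    mutated = [(k, v + payload if k == param else v) for k, v in pairs]
    # urlencode percent-encodes the payload so the request stays well formed.
    return urlunsplit(parts._replace(query=urlencode(mutated)))


# Hypothetical input vector found by a crawler.
attack_url = inject_get_parameter(
    "http://test.local/List.asp?id=12", "id", " AND 1=1"
)
decoded = dict(parse_qsl(urlsplit(attack_url).query))
```

The resulting URL would then be requested and its response passed to the analysis step.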

We instantiate the formal description to generate real test cases; examples are given in Table 6.

Table 6 The example of instantiated test case for subject web applications

Based on the coverage criteria in Definitions 7-8 and the setting in Table 3, we manually generate approximately 50 test case inputs for the attack module of MBT scan.

In many related works[2,6,20], commercial web penetration test tools are used for comparing different test methods. We therefore choose two well-known test tools, Acunetix 6.5 and IBM Rational AppScan 7.7, for comparison (using only their testing function for the SQL injection vulnerability). We refer to them as Tool A and Tool B (in no particular order) to avoid a comparison of commercial brands. Tool A and Tool B also apply the crawling-attack-analysis testing method, and they can be regarded as representatives of the randomly enumerated test case method commonly used in other research. We compare our method with them to show its superior performance over existing related approaches in terms of test cases.

5.3 Measurement metrics

Some related works used the number of detected vulnerable input vectors as the metric of testing effect. However, this number is not an absolute measure of the test effect. Multiple input vectors may correspond to one inner channel through which the web application communicates with its back-end database (e.g., several URL GET parameters may correspond to one SQL command in the web application code, i.e., one database communicating channel). In this situation, we cannot say that the more vulnerable input vectors are detected the better, because a single-line code fix could make many detected vulnerable input vectors safe[21]. A tool that detects more input vectors may simply be finding fewer vulnerable communicating channels than others. As shown in Tables 7 and 8, if a tool detects all the vulnerable input vectors corresponding to channels ⑨ and ○1, the number of its detected vulnerabilities may be counted as 94; however, if another tool detects fewer vulnerable input vectors but covers more channels, the latter is more useful than the former for locating and fixing the vulnerabilities.

We therefore use the number of detected vulnerable communicating channels as the evaluation metric of test effect. Namely, we classify the vulnerable input vectors detected by each tool according to the vulnerable channels to which they correspond. That is, a vulnerable channel is detected if at least one of its corresponding vulnerable input vectors is detected.

Table 7 SQL injection vulnerability seeded in the JSP website

Table 8 SQL Injection vulnerability seeded in the ASP website

In Tables 7 and 8, each vulnerable page represents a vulnerable channel. A series of URL GET parameters, for example, can correspond to one SQL command issued in the source code: for URLs such as /hzp/sub.asp?id=1,2,…, id=X is the input vector and /hzp/sub.asp is the communicating channel. For login forms, the page in which the forms are located is the communicating channel and the forms are the input vectors; the number of such input vectors is always two, the fields for username and password.
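The channel-based metric amounts to grouping the detected input vectors by the page they belong to and counting the groups. A minimal Python sketch is given below; the function name group_by_channel and the /login.asp entry are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set
from urllib.parse import urlsplit


def group_by_channel(vulnerable_urls: Iterable[str]) -> Dict[str, Set[str]]:
    """Group detected vulnerable input vectors by communicating channel."""
    channels = defaultdict(set)
    for url in vulnerable_urls:
        parts = urlsplit(url)
        # The page path is the channel; the query string is the input vector.
        channels[parts.path].add(parts.query)
    return dict(channels)


# Hypothetical detection results of one tool.
detected = ["/hzp/sub.asp?id=1", "/hzp/sub.asp?id=2", "/login.asp?usr=a"]

channels = group_by_channel(detected)
detected_channel_count = len(channels)
```

Three detected input vectors collapse into two channels in this example, so the tool would be credited with two detections under the channel-based metric.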

5.4 Experiment results and discussion

We use the above instantiated test cases and penetration test tools to test the two subject applications. In the SQL injection penetration test on the subject web applications, none of the three tools generates a false positive. The vulnerable channels detected by each tool and the total testing time are listed in Table 9.

The total testing time includes the execution time of the whole crawling-attack-analysis testing procedure of each tool and does not include the time of manually instantiating the test cases for our MBT scan.

Table 9 The testing result and testing time of each method

Our two subject web applications do not use techniques that challenge crawling[9] (e.g., JavaScript, Flash), so all three tools can automatically find the pages ①-○1 and their associated input vectors. The test result is therefore independent of the tools' ability to find input vectors. The experimental result shows that for the vulnerable channels with no defense (Level 0), all three tools can effectively detect the vulnerabilities (①②⑥⑦), while for the vulnerable channels with inadequate defense (Level 1), Tool A and Tool B generated false negatives, failing to identify some vulnerable channels, whereas our MBT scan can detect the SQL injection vulnerabilities hidden behind the inadequate defense (Level 1).

The main reason is that the test case inputs of MBT scan are more effective than those of Tool A and Tool B. On one hand, the test cases of MBT scan are generated under the guidance of the formal description of test cases (Tables 2 and 4), which reflects different kinds of attack methods, whereas the test cases of Tool A and Tool B are mainly a random enumeration of known attack inputs, so they can hardly ensure full consideration of the various kinds of attack methods. On the other hand, our test cases are generated under the coverage criteria in Definitions 7-8, which makes the generated attack inputs cover more patterns (such as more styles of conditionals and various command verbs) than the simple test cases used in Tool A and Tool B. They are therefore more capable of breaking through the inadequate blacklist filter defense (Level 1) and finding the hidden SQL injection vulnerabilities. The test result shows that the simple test case inputs of Tool A and Tool B were blocked by the inadequate blacklist defense (Level 1), so these tools concluded that the tested input vector was protected and not vulnerable. The simple test cases thus prevented them from finding some vulnerable channels hidden behind the inadequate defense.

Table 10 shows that our test case set does not cause excessive time consumption compared with Tools A and B, which confirms its feasibility.

Table 10 The coverage assessment for the test cases used in the three tools

6 Conclusions

This study proposes a formal description based penetration test method for the SQL injection vulnerability. The method mainly addresses the proneness to false negatives of randomly enumerated SQL injection penetration test cases. Our research demonstrates that proper formal modeling and coverage criteria help to reveal the regularity of SQL injection attacks and to generate more effective penetration test cases. Experiments show that, compared with randomly enumerated test cases, the test cases generated by our method detect the SQL injection vulnerabilities hidden behind an inadequate blacklist defense more thoroughly, and thus reduce the false negatives of penetration testing.

Future research based on our study may include integrating this work with research on the other key phases of the penetration test to improve test accuracy, and optimizing the proposed formal descriptions and coverage criteria.

[1]Halfond W G J,Choudhary S R,Orso A.Penetration Testing with Improved Input Vector Identification[C].Proceedings of the 2nd International Conference on Software Testing,Verification,and Validation,Denver,CO,USA,2009:346-355.

[2]Antunes J,Neves N,Correia M,et al.Vulnerability Discovery with Attack Injection[J].IEEE Transactions on Software Engineering,2010,36(3):357-370.

[3]Roongruangsuwan S,Daengdej J.A Test Case Prioritization Method with Practical Weight Factors[J].Journal of Software Engineering,2010,4(3):193-214.

[4]Bau J,Bursztein E,Gupta D,et al.State of the Art:Automated Black-Box Web Application Vulnerability Testing [C].Proceedings of the 2010 IEEE Symposium on Security and Privacy,Berkeley/Oakland,CA,USA,2010:332-345.

[5]Antunes N,Vieira M.Detecting SQL Injection Vulnerabilities in Web Services [C].Proceedings of the 4th Latin-American Symposium on Dependable Computing,Joao Pessoa,Brazil,2009:17-24.

[6]Halfond W G J,Choudhary S R,Orso A.Improving Penetration Testing Through Static and Dynamic Analysis [J].Software Testing Verification and Reliability,2011,21(3):195-214.

[7]McAllister S,Kirda E,Kruegel C.Leveraging User Interactions for In-Depth Testing of Web Applications[C].Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection,Cambridge,MA,United States,2008:191-210.

[8]Huang Y W,Tsai C H,Lin T P,et al.A Testing Framework for Web Application Security Assessment[J].Computer Networks,2005,48(5):739-761.

[9]Doupé A,Cova M,Vigna G.Why Johnny Can't Pentest:an Analysis of Black-Box Web Vulnerability Scanners [C].Proceedings of the 7th GI International Conference on Detection of Intrusions and Malware and Vulnerability Assessment,Bonn,Germany,2010:111-131.

[10]Xiong P,Peyton L.A Model-Driven Penetration Test Framework for Web Applications[C].Proceedings of the 2010 8th Annual International Conference on Privacy Security and Trust,Ottawa,ON,Canada,2010:173-180.

[11]Antunes N,Laranjeiro N,Vieira M,et al.Effective Detection of SQL/XPath Injection Vulnerabilities in Web Services [C].Proceedings of the IEEE International Conference on Services Computing,Bangalore,India,2009:260-267.

[12]OWASP.OWASP Top-10 2010.OWASP_Top_Ten_Project[EB/OL].(2010-12-11)[2014-02-11].www.owasp.org/index.php/Category.

[13]Halfond W G J,Orso A,Manolios P.WASP:Protecting Web Applications Using Positive Tainting and Syntax-Aware Evaluation[J].IEEE Transactions on Software Engineering,2008,34(1):65-81.

[14]Schneier B.Attack Trees[J].Dr.Dobb's Journal,1999,24(12):21-29.

[15]Wang J,Phan R C,Whitley J N,et al.Augmented Attack Tree Modeling of SQL Injection Attacks[C].Proceedings of the 2nd IEEE International Conference on Information Management and Engineering,Chengdu,China,2010:182-186.

[16]Marback A,Do H,He K,et al.Security Test Generation using Threat Trees [C].Proceedings of the ICSE Workshop on Automation of Software Test (AST '09),Vancouver,BC,Canada,2009:62-69.

[17]Fonseca J,Vieira M,Madeira H.Testing and Comparing Web Vulnerability Scanning Tools for SQL Injection and XSS Attacks[C].Proceedings of the 13th Pacific Rim International Symposium on Dependable Computing (PRDC 2007),Melbourne,VIC,Australia,2007:365-372.

[18]Antunes N,Vieira M.Benchmarking Vulnerability Detection Tools for Web Services [C].Proceedings of the 2010 IEEE International Conference on Web Services,Miami,FL,USA,2010:203-210.

[19]Fong E,Gaucher R,Okun V,et al.Building a Test Suite for Web Application Scanners[C].Proceedings of the Annual Hawaii International Conference on System Sciences,Big Island,HI,USA,2008:478.

[20]Li N,Xie T,Jin M Z,et al.Perturbation-Based User-Input-Validation Testing of Web Applications[J].Journal of Systems and Software,2010,83(10):2263-2274.

[21]Kiezun A,Guo P J,Jayaraman K,et al.[C].Proceedings of the International Conference on Software Engineering,Vancouver,BC,Canada,2009:199-209.