Security Coding Practices
The Java platform and its third-party libraries (e.g., BouncyCastle) provide useful features to support secure coding.
Developers often use the APIs defined in these libraries to efficiently build security functionalities.
However, misusing these libraries and frameworks not only slows down code development but also leads to security vulnerabilities.
We conducted an empirical study on StackOverflow posts to understand developers’ concerns about Java secure coding, their programming obstacles, and insecure coding practices. We crawled security-related discussion threads using the keywords "Java" and "security", and manually inspected 503 discussion threads.
Our study revealed the following interesting findings:
- There are security vulnerabilities in the recommended code of some accepted answers. For instance, to resolve errors encountered while implementing Spring Security authentication, developers were advised to work around the errors by disabling the default protection against Cross-Site Request Forgery (CSRF) attacks.
- Various programming challenges were related to security library usage.
For instance, when developers used cryptography APIs, they became stuck due to uninformative error messages, complex cross-language data handling (e.g., encryption in Python and decryption in Java), and delicate implicit API usage constraints.
- Since 2012, developers have increasingly relied on Spring Security for secure coding. Over 50% of the inspected discussions are about the Spring Security framework, yet security and usability research on this framework is still missing.
While the above-mentioned study reveals a significant gap between security theory and coding practices, it is still unclear how seriously developers are misled by insecure coding practices suggested on StackOverflow (SO).
We were curious whether insecure coding suggestions are prevalent on SO and, if so, whether developers can rely on the community's dynamics to choose secure suggestions over insecure ones. Therefore, we conducted a second empirical study. We crawled SO answer posts with code suggestions, and then leveraged Java Baker to extract any security-related implementation. We further applied clone detection to the extracted code data for sampling. Next, we manually inspected the sampled data to decide whether each snippet is implemented in a secure or insecure way. We based our decisions on the security API misuse patterns revealed by other researchers. We observed the following alarming phenomena:
- As with secure answers, insecure answers are prevalent on SO across the entire studied time frame.
45% of the examined answer posts are insecure.
- The community dynamics and SO’s reputation mechanisms are not reliable indicators of whether an answer is secure. Compared with secure posts, insecure ones obtained higher scores, more comments, more favorites, and more views. 45% of the examined accepted answers are insecure.
- The degree of duplication among insecure answers is significantly higher than that among secure ones. This means that users cannot assume a popularly suggested snippet to be secure.
Users seem to post duplicated answers while ignoring security as a key property. They post duplicates either because the questions themselves are duplicated or because they intend to answer more questions by reusing code examples. This behavior is incentivized by the reputation system on SO.
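To make the distinction between a secure and an insecure snippet concrete, the following is a minimal sketch of one widely documented cryptography API misuse, contrasted with a safer alternative. It is an illustration only, not a snippet taken from the studied posts, and it is an assumption that this particular pattern is among those used for labeling. The class and method names (CipherModeSketch, encryptWithDefaultMode, encryptWithGcm, decryptWithGcm, newAesKey) are hypothetical. The sketch also assumes the default JDK provider, where the transformation string "AES" implicitly resolves to AES/ECB/PKCS5Padding, an example of the implicit API usage constraints noted above.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical illustration class; not code from the study.
public class CipherModeSketch {

    // Insecure pattern: "AES" alone leaves mode and padding to the provider.
    // With the default JDK provider this means ECB, which encrypts equal
    // plaintext blocks to equal ciphertext blocks.
    public static byte[] encryptWithDefaultMode(SecretKey key, byte[] plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(plaintext);
    }

    // Safer pattern: explicit authenticated mode with a fresh random 12-byte IV.
    // The IV is prepended to the ciphertext so the receiver can recover it.
    public static byte[] encryptWithGcm(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }

    // Reads the 12-byte IV from the front of the message, then decrypts and
    // verifies the authentication tag.
    public static byte[] decryptWithGcm(SecretKey key, byte[] ivAndCiphertext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, ivAndCiphertext, 0, 12));
        return cipher.doFinal(ivAndCiphertext, 12, ivAndCiphertext.length - 12);
    }

    // Generates a random 128-bit AES key.
    public static SecretKey newAesKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newAesKey();
        // Two identical 16-byte blocks of plaintext.
        byte[] message = "0123456789abcdef0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

        byte[] ecb = encryptWithDefaultMode(key, message);
        boolean repeats = Arrays.equals(
                Arrays.copyOfRange(ecb, 0, 16), Arrays.copyOfRange(ecb, 16, 32));
        System.out.println("default mode repeats ciphertext blocks: " + repeats);

        byte[] gcm = encryptWithGcm(key, message);
        boolean roundTrip = Arrays.equals(message, decryptWithGcm(key, gcm));
        System.out.println("GCM round trip succeeds: " + roundTrip);
    }
}
```

Both variants compile and run without errors, which is part of why the insecure form can spread through copied answers: nothing in the API signals that the shorter transformation string is the weaker choice.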