Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruleset rumor reduction & Undefined behavior rule #3

Open
zi0Black opened this issue May 9, 2022 · 2 comments
Open

Ruleset rumor reduction & Undefined behavior rule #3

zi0Black opened this issue May 9, 2022 · 2 comments

Comments

@zi0Black
Copy link

zi0Black commented May 9, 2022

First of all, @0xdea, you have done great work there with the coverage, also if the support offered by semgrep is still limited.

I found that some C/CPP functions are flagged as vulnerable code in multiple rules, I will go through the one I faced and propose a fix to reduce the rumor produced by the ruleset.

Take the following call vsnprintf(NULL, 0, fmt, string);, this call has a deterministic behavior. Since an amount of 0 bytes is copied to the destination buffer, the first argument of the vsnprintf can be a NULL pointer. The length of the potentially copied data (in the destination buffer if it was big enough) is returned and can be used for other purses (e.g., allocating an appropriate buffer).
I think that the rule format-string-bugs.yaml should not match this case since it's eventually caught by the rule unsafe-ret-snprintf-vsnprintf.yaml.

Suggested change to the "format-string-bugs" rule (I can make the PR if I receive positive feedback):

- pattern-not: vsnprintf($ARG1, 0, ...)
- pattern-not: snprintf($ARG1, 0, ...)

I don't know if eventually could be interesting to have a rule for undefined behavior e.g., vsnprintf(NULL, 100, fmt, string), since this will result in writing some data to a NULL pointer, I didn't investigate it but someone can delight me regarding this 😄. More generally speaking, catch some undefined behavior could be interesting from a security perspective IMHO.

References:

  • ISO/IEC 9899:201x
    • chapter 7.21.6.12 - The vsnprintf function
    • chapter 7.21.6.5 - The snprintf function
@0xdea
Copy link
Owner

0xdea commented May 10, 2022

Hi @zi0Black,

Thank you for your kind feedback.

Indeed, the issue you report is real: there's some overlap between different rules in my ruleset, the prime example being the interesting-api-calls rule that sort of overlaps with most other rules. This is a wanted feature, because in my bug hunting workflow I'd rather get some false positives than miss a single true positive. That said, in my ruleset you have some rules that are aimed to direct the auditor's focus on "code smells" and some other rules that try to detect a specific bug or bug class. I believe this is the source of the confusion: I'll leave this issue open and I'll think about how I could solve it (perhaps by classifying the rules in different categories?).

Regarding your vsnprintf() example, I'm not 100% sure (we should check the specification and library sources when available), but I believe that the format string gets parsed even when nothing is written to the target buffer, therefore there might be a format string bug even in vsnprintf(NULL, 0, fmt, string). For this reason, I'd rather not filter out this pattern from the format-string-bugs rule. On the other hand, in this case the match by unsafe-ret-snprintf-vsnprintf would be a likely false positive, because hopefully the programmer was aware of the behavior of vsnprintf() when calculating its return value.

Your second example vsnprintf(NULL, 100, fmt, string) is definitely undefined behavior: the program would crash because of the NULL pointer dereference. I agree that some undefined behaviors can have security implications, and I'll think about implementing specific rules to catch undefined behaviors. However, it's a complex task and probably not worth the effort at least until Semgrep's support for C/C++ becomes more mature. In addition, most linters and compilers already implement many such checks.

I hope my reply makes sense! Thanks again for taking the time to test my ruleset and report this issue.

PS. I always read your posts on Shielder's blog with pleasure, so keep up the great work!

@zi0Black
Copy link
Author

zi0Black commented May 10, 2022

Hi @0xdea,
I understand and share your point of view regarding getting a false positive being better than missing a vulnerability. But probably, when you have hundreds and hundreds of results, if you don't have to double-check, it would be great.

Regarding the vsnprintf and subsequentially the snprintf, you are right (my bad) is possible to exploit the format strings (in the write way). In the following screenshot the PoC, I used this call snprintf(INIZIALIZED_POINTER,0,FMT,FIXED_POINTER_STRING) to test it.
image

For the undefined behavior rules, it's clear to me your point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants