Overview

The motivation is to extend vulture's functionality as a library so that all of its metadata is passed on through the API, and then to harness this in VultureBear to remove dead code automatically, which would greatly optimise the bear. The second part of the project focuses on providing the source range of the dead code, which would make auto-removal much easier; as of now, vulture only reports the line where the dead code begins. The project also proposes to enhance vulture to detect unreachable code (such as if False blocks, the else branch of if True, and any code written after a return statement), which will help users trim down their codebase without affecting usability. The third part is to implement a confidence value for every result, which will help when tackling false positives.

[Figure: coala VultureBear integration]

Goals

Specifications

1.) Realise vulture’s API in VultureBear

2.) Making whitelist default and extending it further

The first step here would be to make the whitelist default. The important thing is to identify the cases that might cause vulture to report a false positive. This can be achieved through extensive testing against major projects; trending Python projects on GitHub would suit this purpose. This approach would bring several benefits:

3.) Acquiring source range and implementing auto-removal

Analyse and discuss with the community whether ast or the enhanced pyflakes AST would better suit our problem and make source-range acquisition simpler, and arrive at a concrete conclusion. There was also a proposal by @m0hawk to take everything until the next node starts. The discussion is in #25.

If the source range can be fetched successfully, implement the pathway through which this metadata flows in and out of the API. This should not require much work, because item.lineno (an int) can simply be changed to item.dead_range (a tuple of ints) and passed on to VultureBear.
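To make the idea of a source range concrete, here is a minimal sketch using only the standard ast module. It assumes Python 3.8 or newer, where ast nodes carry an end_lineno attribute; the helper name dead_ranges and the sample source are hypothetical and are not part of vulture's API.

```python
import ast


def dead_ranges(source):
    """Map each function/class name to its (first_line, last_line) range.

    Hypothetical helper for illustration; relies on ast's end_lineno,
    which is available from Python 3.8 onwards.
    """
    ranges = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above node.lineno, so start from the earliest one.
            first = min([node.lineno] + [d.lineno for d in node.decorator_list])
            ranges[node.name] = (first, node.end_lineno)
    return ranges


SAMPLE = """\
import os

def unused(a, b):
    total = a + b
    return total

class Helper:
    def method(self):
        pass
"""

RANGES = dead_ranges(SAMPLE)
```

With a tuple like this in hand, a bear could delete the whole range rather than only pointing at the first line.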

4.) Detecting unreachable code

We would first need to identify cases where code cannot be reached. Some of the common ones are:

Similar constructs would have to be looked into. The crude form of this would be:
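As a rough illustration of such a check (a sketch under our own assumptions, not code taken from vulture), the snippet below walks the AST and flags statements that follow a return, raise, break or continue in the same block, the body of if False, and the else branch of if True. The function name find_unreachable and the reason strings are hypothetical, and Python 3.8+ is assumed for ast.Constant and end_lineno.

```python
import ast

# Statements after which nothing else in the same block can run.
TERMINATORS = (ast.Return, ast.Raise, ast.Continue, ast.Break)


def _is_constant(test, value):
    """True if the condition is the literal True/False given by `value`."""
    return isinstance(test, ast.Constant) and test.value is value


def find_unreachable(source):
    """Return sorted (first_line, last_line, reason) tuples."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Case 1: statements following a terminator in the same block.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. a lambda's body is an expression, not a list
            for index, stmt in enumerate(block[:-1]):
                if isinstance(stmt, TERMINATORS):
                    rest = block[index + 1:]
                    reason = "after " + type(stmt).__name__.lower()
                    found.append((rest[0].lineno, rest[-1].end_lineno, reason))
                    break
        # Case 2: branches guarded by a constant condition.
        if isinstance(node, ast.If):
            if _is_constant(node.test, False):
                body = node.body
                found.append((body[0].lineno, body[-1].end_lineno, "if False"))
            elif _is_constant(node.test, True) and node.orelse:
                other = node.orelse
                found.append((other[0].lineno, other[-1].end_lineno, "else of if True"))
    return sorted(found)


SAMPLE = """\
def f(x):
    if False:
        print("never")
    if True:
        x += 1
    else:
        x -= 1
    return x
    print("done")
"""

UNREACHABLE = find_unreachable(SAMPLE)
```

This only covers literal True/False conditions; anything that needs evaluation (for example if 0 or if DEBUG) would need separate handling.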

5.) Implementing a confidence value for results

We would need to analyse every construct individually, case by case. For example, we already know that unused import statements can be reported with 100% certainty (except for * imports, where it would be 0%), whereas functions often produce false positives.

The confidence values would be along the lines of the ones given below (the finer-grained distinctions will need further discussion):

Construct            Confidence value
import               100%
from foo import *    0%
variable             <100%
function             <100%
class                <100%
if False             100%
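The table could be encoded as a simple lookup that lets users filter results by a minimum confidence. This is only a sketch: the value 60 standing in for "<100%" is a placeholder we picked for illustration, and the names CONFIDENCE and filter_by_confidence are hypothetical, not part of vulture.

```python
# Hypothetical placeholder for the constructs listed as "<100%";
# the real values still need discussion.
UNCERTAIN = 60

CONFIDENCE = {
    "import": 100,
    "star_import": 0,         # from foo import *
    "variable": UNCERTAIN,
    "function": UNCERTAIN,
    "class": UNCERTAIN,
    "unreachable_code": 100,  # e.g. the body of `if False`
}


def filter_by_confidence(results, min_confidence):
    """Keep (name, kind) results whose confidence meets the threshold.

    Returns (name, kind, confidence) tuples in the original order.
    """
    return [
        (name, kind, CONFIDENCE[kind])
        for name, kind in results
        if CONFIDENCE[kind] >= min_confidence
    ]


RESULTS = [
    ("os", "import"),
    ("helper", "function"),
    ("foo.*", "star_import"),
]

SAFE_TO_REMOVE = filter_by_confidence(RESULTS, 100)
```

A bear could then offer auto-removal only for results at 100% and merely report the rest.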

References:

Milestones

PREPARATION/BONDING

CODING PHASE 1

CODING PHASE 2

CODING PHASE 3

Thank you for reading along. Please feel free to tweet to @rahul722j, reach out to me at [email protected], or just comment below with any queries.