Finding Dead Code

Rationale

Over time, as we develop a codebase, some code gets forgotten: either it is no longer needed, or a better solution was implemented and the old, inferior code was left behind.

This dead code serves no purpose, but it can cause or contribute to several issues, including:
  • Unnecessary complexity - someone making a change may end up reading code they don’t need to, and be left unsure whether it’s safe to delete, update or use as an example.
  • Performance issues - code that runs when it doesn't need to is just waste.
  • Security holes - a fix for a security hole may completely miss code that’s tucked away. Unused endpoints that aren't updated with new auth logic can be a big cause for concern.

Aims & Purpose

Here we talk over some ideas on how to highlight and safely remove dead code from both JS frontend code and NodeJS/express backends. As far as we can, we will try to automate finding this dead code, because nobody wants to do things manually.

NodeJS

The examples I’ll give here are for NodeJS and express, though in most cases the ideas apply to any backend technology used for writing APIs.

Lint

The quickest, lowest-hanging fruit for getting rid of dead code is linting. Linters can check whether a function or variable is unused and raise a warning to alert you.


For example, take a file example.js in which the function foo and its argument bar are never used. A linter integrated into the editor highlights both and shows an appropriate warning. The same check can also be automated and run from the command line.
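
A minimal sketch of what such a file might look like. Only the names foo and bar come from the text above; the function bodies and the greet function are assumptions added for illustration. Assuming ESLint is installed with its no-unused-vars rule enabled, running `npx eslint example.js` should report both foo and bar.

```javascript
// example.js

// `foo` is declared but never called, and its argument `bar` is never read.
// With the no-unused-vars rule enabled, ESLint would typically warn about both.
function foo(bar) {
  return 'foo';
}

// `greet` is called below, so it is not reported.
function greet(name) {
  return `Hello, ${name}`;
}

console.log(greet('world'));
```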



Linting is a good way to catch low-hanging fruit. However, with JavaScript, linting does not highlight a file that is never imported by any other file, nor whether code is actually used when the application runs.

Tombstones

When trying to find dead code in an application, tombstones seem to be a common, tried and tested method.

A developer adds a "tombstone" to an area of code they believe to be dead. This could be a specific log message, or something that updates a store whenever that piece of code is run.

After a period of time the tombstones are checked. If one has been triggered, it’s safe to say that code is still in use; if not, that’s an indicator it may be dead.

The advantage is that this works while the code is running, so it overcomes the limits of static analysis in determining whether a JS file/module is used.

The downside is that it’s very manual and relies on a developer finding the areas they believe to be dead code. It is also only an indication of dead code: an endpoint that is never hit may simply be rarely used rather than dead. All tombstones can say for certain is which code is not dead.
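
The idea can be sketched in a few lines. This is only an illustration: the tombstone and untriggered helpers, the in-memory Map and the ids used are all hypothetical, and a real application would more likely write to its logs or a persistent store so the hits survive restarts.

```javascript
// Hypothetical in-memory store: tombstone id -> { hits, lastHit }.
const tombstoneHits = new Map();

// Call this at the top of any code suspected to be dead.
function tombstone(id, log = console.warn) {
  const entry = tombstoneHits.get(id) || { hits: 0, lastHit: null };
  entry.hits += 1;
  entry.lastHit = new Date().toISOString();
  tombstoneHits.set(id, entry);
  log(JSON.stringify({ msg: 'tombstone', id, hits: entry.hits }));
}

// Given every tombstone id that was planted, return those never triggered.
function untriggered(plantedIds) {
  return plantedIds.filter((id) => !tombstoneHits.has(id));
}

// A suspected-dead helper with a tombstone planted in it.
function legacyExport() {
  tombstone('legacy-export');
  return [];
}

legacyExport();

// 'legacy-export' was hit, so only 'old-report' is still a removal candidate.
console.log(untriggered(['legacy-export', 'old-report']));
```

Anything returned by untriggered is still only a candidate for removal, for the reasons given above.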

Access Logs vs Available Endpoints

While tombstoning relies on a developer to guess which code is not being used, we can improve on this slightly by automating the guessing.

If every API call to your backend goes through one point, you can log at that point when it was accessed and the request URL used. This gives you metrics on which endpoints are being hit and how often.

Given a list of all the possible endpoints, old or unused endpoints are flagged, as with tombstoning, because they are never called. This is determined by comparing the list against the number of access-log entries recorded for each endpoint.

The advantage is that a developer doesn't have to guess which endpoints are unused. Combined with the other advantages of tombstoning, this makes it a decent way of finding dead code.

However, while this may show endpoints that aren't used and point a developer toward dead code, the method becomes more problematic for finer-grained rooting out of individual unused functions. It also shares tombstoning's problem: it is merely an indication that the code/endpoint is unused, not a guarantee.
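
The comparison itself might look something like the sketch below. The route list, the log entries and the helper names (routeToRegExp, findUnusedRoutes) are all made up for illustration; in practice the routes would come from your router and the entries from your log store.

```javascript
// Turn an Express-style path such as '/baz/:id' into a RegExp.
function routeToRegExp(path) {
  const pattern = path
    .split('/')
    .map((segment) =>
      segment.startsWith(':')
        ? '[^/]+' // dynamic segment: match any single path segment
        : segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') // escape literal text
    )
    .join('/');
  return new RegExp(`^${pattern}/?$`);
}

// Return the routes that have no matching access-log entry.
function findUnusedRoutes(routes, accessLogs) {
  return routes.filter((route) => {
    const matcher = routeToRegExp(route.path);
    return !accessLogs.some(
      (log) =>
        log.method === route.method && matcher.test(log.url.split('?')[0])
    );
  });
}

// Hypothetical data for illustration only.
const routes = [
  { method: 'GET', path: '/foo/bar' },
  { method: 'GET', path: '/baz/:id' },
  { method: 'POST', path: '/legacy/report' },
];
const accessLogs = [
  { method: 'GET', url: '/foo/bar' },
  { method: 'GET', url: '/baz/42?verbose=1' },
];

// Only POST /legacy/report has no matching log entry.
console.log(findUnusedRoutes(routes, accessLogs));
```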

Example - AWS CloudWatch & Express Output Routes

Here are some examples of how a developer can search CloudWatch for access logs to determine whether an endpoint in an application is still in use. The examples use CloudWatch Logs and the JSON logs our application emits, but the same approach can be applied to nginx logs, Apache access logs and many other log-searching tools, such as Splunk, or grep for the hardcore.

This example assumes our Node app has middleware that logs the same message, “access-log”, along with the route and method used, every time the app is accessed.
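
A minimal sketch of such a middleware. The accessLog name and the plain JSON.stringify logging are assumptions; the only requirement is that each log line carries msg, req.method and req.url, which are the fields the queries below rely on.

```javascript
// Express-style middleware factory. The returned function has the usual
// (req, res, next) signature. `log` defaults to console.log; a real app
// might pass in a JSON logger instead.
function accessLog(log = console.log) {
  return function accessLogMiddleware(req, res, next) {
    log(
      JSON.stringify({
        msg: 'access-log',
        req: { method: req.method, url: req.originalUrl || req.url },
      })
    );
    next();
  };
}

// In an Express app this would be registered before the routes:
//   app.use(accessLog());

// Quick check without Express, using a stand-in request object.
const lines = [];
accessLog((line) => lines.push(line))(
  { method: 'GET', url: '/foo/bar' },
  {},
  () => {}
);
console.log(lines[0]);
```

Note that the logged URL includes any query string, so exact-match searches may need a wildcard, or the middleware could log the path on its own.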


Get All Access Logs

To see all the access logs we can simply search for the message "access-log".

{ $.msg = "access-log" }

Simple URL Query

To match multiple JSON keys/values in CloudWatch we can use “&&”. The following matches all GET requests to “/foo/bar”.

{ $.req.method = "GET" && $.msg = "access-log" && $.req.url = "/foo/bar" }

Dynamic Segments

Many URLs have dynamic segments, such as resource IDs. We can use the wildcard “*” to handle most of these. The following matches all GET requests to “/baz” followed by anything else, such as an ID (e.g. “/baz/1”, “/baz/2”, “/baz/3”, etc.).

{ $.req.method = "GET" && $.msg = "access-log" && $.req.url = "/baz/*" }

Checking The Results

If CloudWatch comes back with no results when searching for an endpoint, this can be an indication that the endpoint and its underlying code are no longer in use.

For the full docs on querying/searching logs in CloudWatch, see AWS Filter and Pattern Syntax.

Frontend

Finding dead code in the frontend can in some cases reuse the methods described in the NodeJS section, and there are also other tools to help keep redundant code out of the production release.

Methods Already Mentioned

As with NodeJS, a JS frontend project can take advantage of:
  • Linting
  • Tombstones
So we won’t go over them again here.

Minifying & Tree Shaking

Minifying is common practice for frontend codebases before deploying. Modern minifiers will generally remove unreachable code as part of this process, thus removing dead code from production.
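
As a rough illustration, the snippet below contains two kinds of unreachable code that a minifier may strip. The DEBUG flag and the total function are made up for this sketch, and what actually gets removed depends on the tool and its configuration (Terser, for instance, has a dead_code compress option for this).

```javascript
const DEBUG = false;

function total(prices) {
  if (DEBUG) {
    // Never runs: DEBUG is a constant false, so this branch can be dropped.
    console.log('prices', prices);
  }
  return prices.reduce((sum, price) => sum + price, 0);
  // Unreachable: nothing after the return above can execute.
  console.log('done');
}

console.log(total([1, 2, 3]));
```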



Tree shaking builds on this by relying on ES2015 module syntax, i.e. import and export. With this, the bundler can detect entirely unused modules and leave them out of production builds. For a good guide, check out the webpack guides, which include examples: https://webpack.js.org/guides/tree-shaking/.

The downside is that dead code is only removed from the production build; the source will always contain it, leaving it as a possible source of confusion for a developer.

Chrome Dev Tools Code Coverage

Chrome 59 added the ability to check code coverage for JavaScript and CSS. Open the dev tools, go to the Coverage tab and begin recording. Then use the application in question for a little while and stop recording when done. Chrome will then show which lines of code were covered during the recording.
See https://developers.google.com/web/updates/2017/04/devtools-release-notes for the full release notes and more resources.

The drawback is that you only get coverage for code exercised while you were interacting with the application during the recording. Lines without coverage may simply mean you didn’t click the button that triggers that function.

A good way to automate this is to record coverage during E2E automations and then review the reports. Hopefully the automations cover most workflows, so unused code from previously deprecated workflows becomes visible.
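
Reviewing those reports can itself be scripted. The sketch below assumes coverage entries shaped like { url, text, ranges: [{ start, end }] }, which is the form Puppeteer's page.coverage.stopJSCoverage() returns; other E2E tools may use a different shape. The summariseCoverage helper and the sample data are hypothetical.

```javascript
// Summarise coverage entries shaped like
// { url, text, ranges: [{ start, end }] } and sort the least-used files first.
function summariseCoverage(entries) {
  return entries
    .map((entry) => {
      const totalBytes = entry.text.length;
      const usedBytes = entry.ranges.reduce(
        (sum, range) => sum + (range.end - range.start),
        0
      );
      const unusedBytes = totalBytes - usedBytes;
      return {
        url: entry.url,
        totalBytes,
        unusedBytes,
        unusedPercent:
          totalBytes === 0 ? 0 : Math.round((unusedBytes / totalBytes) * 100),
      };
    })
    .sort((a, b) => b.unusedPercent - a.unusedPercent);
}

// Hand-written sample data standing in for a real recording.
const sample = [
  { url: '/app.js', text: 'x'.repeat(200), ranges: [{ start: 0, end: 150 }] },
  { url: '/legacy.js', text: 'x'.repeat(100), ranges: [] },
];

console.log(summariseCoverage(sample));
```

Files near the top of the list, such as /legacy.js here with nothing covered, are the ones worth inspecting first.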

Closing Notes

If you can think of any other options that overcome the cons of those mentioned above, do add a comment or something :)
