Midnight Hack 3: Scraping LinkedIn connections using JavaScript

Sanat Dutta
3 min read · May 14, 2020

For reasons, I needed to export a list of my LinkedIn network contacts specifically in the Pharmaceutical Industry.

You can filter and view your contacts on LinkedIn, but it neither lets you export the data nor offers an official API for doing so.

So, we’ll be using our logged-in browser session to make unofficial API calls to scrape the contacts.

I’ve heard that if you say something is for educational purposes, you can pretty much get away with anything. So yeah, this is for educational purposes :)

Let’s begin by going to the LinkedIn Contacts page and applying all the required filters.

Source: Screenshot

Now open DevTools in your browser and switch to the Network tab. After a quick look through the requests, it shouldn’t be difficult to identify the API call responsible for fetching contacts.

Source: Screenshot

Now you can make the same API request from the browser console and get the paged contacts without even implementing OAuth :)

Let’s make a simple REST call from the console tab and check out the response.
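A first attempt could look like the sketch below. It assumes you paste the request URL exactly as it appears in the Network tab; `CONTACTS_URL` is a placeholder and `fetchContactsPage` is a name made up for this example, not part of any LinkedIn API. The `fetchImpl` parameter is only there so the function can be tried outside the browser.

```javascript
// Hypothetical placeholder: replace with the request URL copied from the Network tab.
const CONTACTS_URL = "https://www.linkedin.com/PASTE-PATH-AND-QUERY-FROM-NETWORK-TAB";

// Sends a GET request that reuses the cookies of the current logged-in session.
// fetchImpl defaults to the browser's fetch and can be swapped for a stub.
async function fetchContactsPage(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { credentials: "include" });
  const body = await response.text();
  return { ok: response.ok, status: response.status, body };
}

// In the console: fetchContactsPage(CONTACTS_URL).then(console.log);
```

Reading the body as text rather than JSON keeps the sketch from throwing if the server answers with an error page.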

Upon making the request, you’ll find that it errors out because of a failed CSRF check.

Source: Screenshot

This is a countermeasure to stop unwanted API calls from logged-in clients. But since you’re already logged in and can see the request headers of the API call made earlier through the LinkedIn page, you can easily copy the csrf-token and the other request headers into your own API request.

Source: https://i.kym-cdn.com/entries/icons/original/000/022/138/highresrollsafe.jpg
Source: Screenshot

Switch to the Headers tab, copy the csrf-token and the other request headers, and make the API call again.
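As a rough sketch, the retry with the copied headers might look like this. The token value and any extra headers are whatever you copied from the Headers tab; `buildHeaders` and `fetchWithHeaders` are hypothetical helper names for this example, and the sketch assumes the endpoint answers with JSON.

```javascript
// Merges the copied csrf-token into the other copied request headers.
// Both arguments are values you paste in from the Headers tab.
function buildHeaders(csrfToken, extraHeaders = {}) {
  return { ...extraHeaders, "csrf-token": csrfToken };
}

// Repeats the request with the session cookies plus the copied headers.
// Assumes a JSON response; throws on any non-2xx status.
async function fetchWithHeaders(url, headers, fetchImpl = fetch) {
  const response = await fetchImpl(url, { credentials: "include", headers });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// In the console (placeholders, not real values):
// const headers = buildHeaders("PASTE-CSRF-TOKEN", { accept: "application/json" });
// fetchWithHeaders("PASTE-REQUEST-URL", headers).then(console.log);
```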

Source: Screenshot

And …… Voilà

This only returns 10 results per page, and for some reason the API responses became unpredictable when I tried to increase the count through the request parameter. So let’s just stick to 10 results per request and add a loop to fetch all the contacts.
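A minimal sketch of such a loop is below. Everything named here is an assumption made for the example: `buildUrl` is a function you write from the URL seen in the Network tab, `extractContacts` pulls the contact array out of whatever shape the response has, and `delayMs` defaults to 0 so no pause is inserted unless you ask for one.

```javascript
// Resolves after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetches page after page until a page comes back empty, a request fails,
// or maxContacts have been collected.
// buildUrl(start, count) and extractContacts(json) are hypothetical,
// user-supplied helpers; their details depend on the request you observed.
async function fetchAllContacts({
  buildUrl,
  headers,
  extractContacts,
  pageSize = 10,
  startAt = 0,
  maxContacts = 1000,
  delayMs = 0,
  fetchImpl = fetch,
}) {
  const contacts = [];
  for (let start = startAt; contacts.length < maxContacts; start += pageSize) {
    const response = await fetchImpl(buildUrl(start, pageSize), {
      credentials: "include",
      headers,
    });
    if (!response.ok) break; // the server has started rejecting requests
    const page = extractContacts(await response.json());
    if (page.length === 0) break; // no more results
    contacts.push(...page);
    if (delayMs > 0) await sleep(delayMs);
  }
  return contacts.slice(0, maxContacts);
}
```

A `buildUrl` might be as simple as ``(start, count) => `${BASE}&start=${start}&count=${count}` ``, assuming the request you copied pages with `start` and `count` query parameters; check the names against your own Network tab.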

This piece of code will loop the request with an increasing page offset and save the fetched contacts in an array. Since we’re impatient and haven’t added any delay between API requests, the server starts terminating the requests after 1000 or so contacts have been fetched.

To avoid this, you can add a delay between requests or re-run the same script with the start index set to 1000. But for now, 1000 contacts will do.

Source: Screenshot

Now, let’s filter out the unwanted fields and export it as CSV.
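One way to do this, sketched under assumptions: the field names you pass in (for example `firstName`, `lastName`, `occupation`) are hypothetical and should match whatever keys your fetched contact objects actually have. `toCsv` keeps only the listed fields and quotes values that contain commas, quotes, or line breaks; `downloadCsv` is browser-only and triggers a file download from the console.

```javascript
// Quotes a value for CSV: wraps it in double quotes when it contains a comma,
// quote, or line break, and doubles any embedded quotes.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds CSV text with a header row, keeping only the requested fields.
function toCsv(rows, fields) {
  const header = fields.map(csvEscape).join(",");
  const lines = rows.map((row) => fields.map((f) => csvEscape(row[f])).join(","));
  return [header, ...lines].join("\r\n");
}

// Browser-only: offers the CSV text as a file download.
function downloadCsv(csvText, filename = "contacts.csv") {
  const blob = new Blob([csvText], { type: "text/csv;charset=utf-8" });
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  link.click();
  URL.revokeObjectURL(link.href);
}

// In the console, with contacts being the array fetched earlier:
// downloadCsv(toCsv(contacts, ["firstName", "lastName", "occupation"]));
```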

And….. here’s the result.

Source: Screenshot

Until next time….

Sanat Dutta

An amateur photographer who uses Data Science, Machine Learning & Automation to make his life easier. See my work at instagram.com/sanatoverhere