# How to use Cracking the Coding Interview

Usually in BFS, we mark a node as visited by setting a visited flag in its node class. Here we don't want to do that: there may be multiple searches running at the same time, so it's a bad idea to edit our data directly. Instead, we could mimic node marking with a hash table that looks up a node ID and determines whether it was visited or not.
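
As a rough illustration of this idea, the sketch below keeps the visited marks in a hash-based set that belongs to a single search. The function name `bfs_path_exists` and the adjacency-dictionary representation of the graph are assumptions made for this example, not something the original text specifies.

```python
from collections import deque


def bfs_path_exists(graph, start_id, target_id):
    """Return True if target_id is reachable from start_id.

    graph: hypothetical dict mapping a node ID to a list of neighbour IDs.
    The visited set is local to this call, so the graph itself is never
    modified and several searches can run over the same data at once.
    """
    visited = {start_id}  # hash-based set stands in for a per-node flag
    queue = deque([start_id])
    while queue:
        node_id = queue.popleft()
        if node_id == target_id:
            return True
        for neighbour_id in graph.get(node_id, []):
            if neighbour_id not in visited:
                visited.add(neighbour_id)
                queue.append(neighbour_id)
    return False
```

Because each call owns its own `visited` set, two searches over the same graph cannot interfere with each other's marks.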

Other follow-up questions:

»In real life, servers crash. How does this affect you?

»How can you take advantage of caching?

»Do you search until the end of the graph (which may be unbounded)? How do you decide when to give up?

»In real life, some people have more friends than others, so they are more likely to form a path between you and someone else. How can you use this data to choose where to start traversing?

Follow-up: What if we only have 10MB of memory?

A missing integer can be found in just two passes over the data set. We can divide the integers into blocks of a certain size (we'll discuss how to decide on a size later). Suppose we divide the integers into blocks of 1000. So block 0 represents the numbers 0 to 999, block 1 represents the numbers 1000 to 1999, and so on. Since the range of integers is finite, we know that the number of blocks required is finite.

In the first pass, we count the number of integers in each block. That is, if we see 552, we know that it is in block 0, so we increment counter[0]. If we see 1425, we know that it is in block 1, so we increment counter[1].

At the end of the first pass, we can quickly spot a block that is missing a number. If our block size is 1000, then any block with fewer than 1000 numbers must be missing a number. Select any one of those blocks.

In the second pass, we actually look for which number is missing. We can do this by creating a simple bit vector of size 1000. We go through the file, and for each number that belongs in our block, we set the appropriate bit in the bit vector. At the end, we will know which number (or numbers) is missing.
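
The two passes described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `find_missing`, the `read_numbers` callable (standing in for re-reading the file), and the requirement that the input values are distinct and lie in `[0, value_range)` are all introduced for this example.

```python
def find_missing(read_numbers, value_range, block_size):
    """Return one integer in [0, value_range) absent from the data, or None.

    read_numbers: hypothetical zero-argument callable returning a fresh
    iterator over the file's integers (assumed distinct and in range), so
    the data can be scanned twice.
    """
    num_blocks = (value_range + block_size - 1) // block_size

    # Pass 1: count how many values fall into each block.
    counters = [0] * num_blocks
    for n in read_numbers():
        counters[n // block_size] += 1

    # Pick any block holding fewer values than its capacity.
    target = None
    for block, count in enumerate(counters):
        capacity = min(block_size, value_range - block * block_size)
        if count < capacity:
            target = block
            break
    if target is None:
        return None  # every value in the range is present

    # Pass 2: set one bit per value seen inside the chosen block.
    start = target * block_size
    capacity = min(block_size, value_range - start)
    bits = bytearray((capacity + 7) // 8)
    for n in read_numbers():
        if start <= n < start + capacity:
            offset = n - start
            bits[offset // 8] |= 1 << (offset % 8)

    # The first clear bit marks a missing number.
    for offset in range(capacity):
        if not bits[offset // 8] & (1 << (offset % 8)):
            return start + offset
    return None
```

For instance, with the values 0 to 4999 except 3141 and a block size of 1000, the first pass finds that block 3 holds only 999 values, and the second pass scans that block's bit vector to report 3141.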

Now all we have to do is decide the size of the block.

A quick answer is 2^20 values per block. We will need an array with 2^12 block counters and a bit vector of 2^17 bytes. Both of these can comfortably fit in 10 * 2^20 bytes.

What is the smallest footprint? It is reached when the array of block counters occupies the same memory as the bit vector. Let N = 2^32.
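
To make the arithmetic for the 2^20 block size concrete, here is a small sketch that computes both memory costs. The helper name `footprint_bytes` and the choice of 4 bytes per counter are assumptions made for this example; the text only gives the number of counters.

```python
def footprint_bytes(block_size, n=2**32, counter_bytes=4):
    """Return (counter_array_bytes, bit_vector_bytes) for a given block size.

    counter_bytes=4 is an assumption: one 32-bit counter per block.
    """
    num_blocks = n // block_size          # 2^32 / 2^20 = 2^12 blocks
    counter_array_bytes = num_blocks * counter_bytes
    bit_vector_bytes = block_size // 8    # one bit per value in the block
    return counter_array_bytes, bit_vector_bytes


counters, bit_vector = footprint_bytes(2**20)
budget = 10 * 2**20  # 10 MB expressed in bytes
print(counters, bit_vector, counters + bit_vector <= budget)
# prints: 16384 131072 True
```

Under these assumptions the counter array takes 2^14 bytes and the bit vector 2^17 bytes, both far below the 10 MB budget.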

First, how does the crawler get into a loop? The answer is very simple: when we re-parse a page that has already been parsed. This would mean revisiting all the links found on that page, and this would continue in a circular fashion.

Be careful about what the interviewer considers the "same" page. Is it the URL or the content? One could easily be redirected to a previously crawled page.

So how do we stop visiting a page that has already been visited? The web is a graph-based structure, and we commonly use DFS (depth-first search) and BFS (breadth-first search) for traversing graphs. We can mark previously visited pages in the same way as we would in BFS/DFS.
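
A minimal sketch of this marking, assuming pages are identified purely by their URL string (which sidesteps the URL-versus-content question raised above). The names `crawl` and `get_links` are hypothetical; `get_links` stands in for whatever fetches a page and extracts its links.

```python
from collections import deque


def crawl(seed_url, get_links):
    """Visit every page reachable from seed_url exactly once, breadth-first.

    get_links: hypothetical callable returning the URLs linked from a page.
    Returns the URLs in the order they were parsed.
    """
    visited = {seed_url}
    queue = deque([seed_url])
    parsed = []
    while queue:
        url = queue.popleft()
        parsed.append(url)  # stands in for parsing the page
        for link in get_links(url):
            if link not in visited:  # skip pages already seen
                visited.add(link)
                queue.append(link)
    return parsed


# Toy link graph containing cycles, used in place of real network access.
web = {"a": ["b", "c"], "b": ["a"], "c": ["b", "a"]}
print(crawl("a", lambda url: web.get(url, [])))
# prints: ['a', 'b', 'c']
```

Even though every page in the toy graph links back to an earlier one, each URL is parsed only once, so the crawl stops after three steps.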

We can easily show that this algorithm terminates in any case. We know that each step of the algorithm parses only new pages that have not yet been visited. So if we assume that we have N unvisited pages, then each step reduces N by 1. This proves that our algorithm will continue for only N steps.