Tag handling

Once the text is tokenized as shown in Figure 3, search_index loops over the tokens, alternating between tags and the text between them. Within the token loop, tag handling consists mainly of tracking tags as they open and close, possibly nested, and assigning the appropriately boosted score to the text they enclose. Following the array in Figure 3, for example, the item in position #1, h1, is pushed onto an array called $tagstack. At the same time, $score is increased by that tag's value in the $tags array.
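The stack-and-score idea can be sketched in a few lines. The following is a minimal illustration in Python, not the article's search_index code. It assumes a token list in which even positions hold text and odd positions hold tag names, with closing tags written as "/name". The weights in TAGS, the base score of 1, and the rule that a closing tag subtracts the same boost it added are all assumptions made for the example.

```python
# Hypothetical tag weights; the article's actual $tags values are not shown here.
TAGS = {"h1": 5, "b": 2}


def score_tokens(tokens, tags=TAGS, base_score=1):
    """Return (text, score) pairs for a token list that alternates text and tags.

    Assumed layout: even positions hold text, odd positions hold tag names,
    and closing tags are written as "/name".
    """
    tagstack = []        # currently open tags, innermost last
    score = base_score   # score applied to the text seen right now
    scored = []
    for position, token in enumerate(tokens):
        if position % 2 == 0:
            # Text token: record it with the current score, skipping blanks.
            if token.strip():
                scored.append((token, score))
        elif token.startswith("/"):
            name = token[1:]
            # Only pop when the closing tag matches the innermost open tag.
            if tagstack and tagstack[-1] == name:
                tagstack.pop()
                score -= tags.get(name, 0)
        else:
            # Opening tag: push it and add its boost (0 for unknown tags).
            tagstack.append(token)
            score += tags.get(token, 0)
    return scored


tokens = ["", "h1", "Site ", "b", "Search", "/b", "", "/h1", " body text"]
result = score_tokens(tokens)
```

With these assumed weights, text inside h1 scores 6, text inside both h1 and b scores 8, and text outside any tag falls back to the base score of 1. Mismatched closing tags are simply ignored here; a real indexer would need a policy for malformed markup.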
