Tribhuvan University
Institute of Science and Technology
Bachelor Level / Fourth Year / Seventh Semester / Science    Full Marks: 60
Computer Science and Information Technology (CSc.405)    Pass Marks: 24
(Internet Technology)    Time: 3 hours
Candidates are required to give their answers in their own words as far as possible.
The figures in the margin indicate full marks.
Attempt any ten questions.
- How is IR in web search different from other IR systems? Discuss the IR architecture with a suitable example. (2+4)
- Assume that the document space is defined by five terms: Network, CSIT, Nepal, TU, and Graduate. We have three documents containing the following terms:
- Doc1: CSIT Nepal
- Doc2: TU CSIT
- Doc3: CSIT TU Nepal
- What is meant by stop word removal? Explain text normalization with a suitable example. (1+5)
- Suppose the table given below lists all the documents retrieved by an algorithm. If the total number of relevant documents is 6, calculate the values of recall, precision, and F-score. (6)

| SN | Doc ID | Relevant |
|----|--------|----------|
| 1  | D1     | no       |
| 2  | D2     | no       |
| 3  | D3     | yes      |
| 4  | D4     | no       |
| 5  | D5     | yes      |
| 6  | D6     | yes      |
| 7  | D7     | no       |
| 8  | D8     | no       |
| 9  | D9     | yes      |

- Why is query expansion important? Discuss query expansion techniques with examples. (1+5)
- Why is the HITS algorithm used? Discuss its working with an example. (2+4)
- How are bots different from spiders? Describe the simple and multithreaded spidering algorithms. (1+5)
- How is text categorization different from clustering? Explain the nearest neighbor categorization algorithm. (1+5)
- Differentiate collaborative filtering from content-based filtering. Discuss the content-based recommender system with its strengths and drawbacks. (2+4)
- Why is TF-IDF weighting important in information retrieval? Explain with a suitable example. (6)
- How does information extraction differ from information retrieval? Discuss the role of XML in information extraction. (6)
- Write short notes on: (3+3)
- Latent Semantic Indexing
- Spiders