Abstract
With the spread of the internet, users are exposed to a large number of unhealthy webpages, which can seriously affect visitors' physical and mental health. To protect the legitimate rights and interests of internet users and to support the harmonious and stable development of society, a new unhealthy webpage discovery system is needed. This paper first introduces background on unhealthy webpages and web crawlers, and then presents the overall plan and design of the system. The system collects unhealthy information through URL acquisition and URL filtering, and classifies page text with a convolutional neural network (CNN). The experimental results show that the CNN-based unhealthy webpage discovery system greatly improves the accuracy of unhealthy webpage discovery and reduces the omission rate, and that it can meet users' needs for unhealthy webpage discovery.
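The URL acquisition and URL filtering steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the system's actual rules, so everything here is an assumption made for illustration: the names `LinkCollector`, `acquire_urls`, `filter_urls`, and `SKIPPED_EXTENSIONS` are hypothetical, as are the filtering criteria (HTTP(S) only, no static-resource extensions, no duplicates). The CNN text classification step is not shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse

# Hypothetical list of extensions to skip; not taken from the paper.
SKIPPED_EXTENSIONS = (".jpg", ".png", ".gif", ".css", ".js", ".pdf", ".zip")


class LinkCollector(HTMLParser):
    """Collects raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def acquire_urls(base_url, html):
    """URL acquisition: extract links, resolve them against the page URL,
    and drop fragments so that /a and /a#top count as the same page."""
    parser = LinkCollector()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.hrefs]


def filter_urls(urls, seen):
    """URL filtering (assumed rules): keep only HTTP(S) pages that are not
    static resources and have not been seen before. Updates `seen` in place."""
    kept = []
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https"):
            continue
        if parts.path.lower().endswith(SKIPPED_EXTENSIONS):
            continue
        if url in seen:
            continue
        seen.add(url)
        kept.append(url)
    return kept
```

In a crawler built along these lines, the URLs returned by `filter_urls` would be fetched next, and the text of each fetched page would be passed to the classifier.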