Automated Data Mining in Python
Introduction to automated data mining in Python
In this article, we will learn how one can execute automated data mining in Python. Data mining is the process of extracting useful patterns and information from large datasets; it draws heavily on machine learning, which is itself a subfield of artificial intelligence. One application of data mining in Python is to search a text for a particular keyword and then give the words associated with it in the output window.
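As a toy illustration of that keyword application (the text, keyword, and regular expression here are made up for the example and are not part of the pipeline below), one can scan a string for the word that follows each occurrence of a keyword:

import re

text = "data mining extracts patterns and data cleaning removes noise"
keyword = "data"

# find the word immediately following each occurrence of the keyword
associated = re.findall(r"\b%s\s+(\w+)" % re.escape(keyword), text)
print associated  # -> ['mining', 'cleaning']

A trained model would rank such associations statistically; the regular expression only shows the input/output shape of the task.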
To build the application described above, one needs to train a model developed with a machine learning algorithm on a good amount of data. To automate the fetching of that data, an API is needed. One must first determine what kind of API one is dealing with, because APIs are generally either REST or SOAP. The basic code is built up step by step below.
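A quick way to tell the two styles apart (the endpoints below are placeholders, not real services): a REST API is consumed with plain HTTP requests, while a SOAP API publishes a WSDL document that a client library such as suds reads to discover the available operations:

import requests
from suds.client import Client

# REST: a plain HTTP call to a resource URL (placeholder endpoint)
rest_response = requests.get("http://example.com/api/items/1")

# SOAP: the client is generated from the service's WSDL description
soap_client = Client("http://example.com/service?wsdl")
print soap_client  # printing a suds client lists the operations the WSDL exposes

The service used later in this article is a SOAP one, which is why the code below builds a suds client from a WSDL URL.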
There are several libraries that users need to import to make a connection with the API. The first is the datetime module, which is useful for tracking the current date and time, including the time at which the data is fetched from the source.
Next come Client and the rest of the suds library, which are mainly responsible for creating the connection with the SOAP API. Then there is cStringIO, which is used to create an in-memory text object that later code can write log output to and read back from. BeautifulSoup is imported to parse the XML that the service returns. Note that cStringIO exists only in Python 2, so the listings in this article target Python 2 (on Python 3, io.StringIO serves the same purpose). The import code is shown here:
# importing the required modules
import logging, time, requests, re, suds_requests
from datetime import timedelta, date, datetime, tzinfo
from requests.auth import HTTPBasicAuth
from suds.client import Client
from suds.wsse import *
from suds import null, TypeNotFound
from cStringIO import StringIO
from bs4 import BeautifulSoup as Soup
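Since BeautifulSoup is only imported above, here is a minimal sketch of the role it plays later (the XML string is a made-up stand-in for a real service response):

# hypothetical raw XML, standing in for a SOAP response body
raw = "<result><service><id>42</id><name>example</name></service></result>"
doc = Soup(raw, "html.parser")
print doc.find("name").get_text()  # -> example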
In the next piece of code, StringIO creates an object named log_stream to which the logging output is written. The calls to basicConfig and getLogger configure logging by setting the severity level and by routing the suds transport and client logs into that stream. The code is shown below:
# creating an object for logging purposes
log_stream = StringIO()
logging.basicConfig(stream=log_stream, level=logging.INFO)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
logging.getLogger('suds.client').setLevel(logging.DEBUG)

# defining the link for the client-server connection
WSDL_URL = 'http://213.166.38.97:8080/SRIManagementWS/services/SRIManagementSOAP?wsdl'
After that, we define a username and password for the session created in the lines that follow. The session is what actually transfers the data, and attaching the credentials to it authenticates every request it makes. The code is shown below:
# creating a username and password for the session
user_name = 'username'
pass_word = 'pass_word'

# creation of the session and its authentication
session = requests.session()
session.auth = (user_name, pass_word)
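One thing the original listing leaves implicit is the creation of the client object used below. A minimal sketch, assuming the suds_requests import is meant to route the SOAP traffic through the authenticated session just created:

# building the suds client on top of the authenticated requests session
client = Client(WSDL_URL,
                transport=suds_requests.RequestsTransport(session))

With this transport in place, every SOAP call made through client reuses the session's HTTP Basic credentials.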
The function called addSecurityHeader attaches the username and password to the SOAP client as a WS-Security UsernameToken. On each call, the server verifies these credentials and returns an error message if they are wrong. The program for the same is as follows:
# function to attach the username and password before data fetching
def addSecurityHeader(client, user_name, pass_word):
    security = Security()
    # creating a token for the username and password
    userNameToken = UsernameToken(user_name, pass_word)
    security.tokens.append(userNameToken)
    client.set_options(wsse=security)

# calling the function
addSecurityHeader(client, user_name, pass_word)

val1 = "argument_1"
val2 = "argument_2"

# using a try/except block in case of an error message
try:
    client.service.GetServiceById(val1, val2)
except TypeNotFound as e:
    print e

logresults = log_stream.getvalue()
The complete program looks like this:
import logging, time, requests, re, suds_requests
from datetime import timedelta, date, datetime, tzinfo
from requests.auth import HTTPBasicAuth
from suds.client import Client
from suds.wsse import *
from suds import null, TypeNotFound
from cStringIO import StringIO
from bs4 import BeautifulSoup as Soup

# creating an object to configure the logging parameters
log_stream = StringIO()
logging.basicConfig(stream=log_stream, level=logging.INFO)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
logging.getLogger('suds.client').setLevel(logging.DEBUG)

WSDL_URL = 'http://213.166.38.97:8080/SRIManagementWS/services/SRIManagementSOAP?wsdl'

# creating a username and password for security reasons
user_name = 'username'
pass_word = 'password'
session = requests.session()
session.auth = (user_name, pass_word)

# building the SOAP client on top of the authenticated session
client = Client(WSDL_URL,
                transport=suds_requests.RequestsTransport(session))

# security function that attaches the credentials to the client
def addSecurityHeader(client, user_name, pass_word):
    security = Security()
    userNameToken = UsernameToken(user_name, pass_word)
    security.tokens.append(userNameToken)
    client.set_options(wsse=security)

addSecurityHeader(client, user_name, pass_word)

val1 = "argument_1"
val2 = "argument_2"

try:
    client.service.GetServiceById(val1, val2)
except TypeNotFound as e:
    print e

logresults = log_stream.getvalue()
The above program shows the basic libraries one needs to import, as well as how to attach the security token that authenticates the data fetched from the service. Essentially, this code makes a connection between the machine and an API. Once the API connection is in place, a database such as MySQL is needed to store the fetched data.
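As a minimal sketch of that last step (the MySQLdb driver, the connection details, and the table layout are assumptions for illustration, not part of the original listing):

import MySQLdb

# hypothetical connection details and table layout
conn = MySQLdb.connect(host="localhost", user="root",
                       passwd="secret", db="mining_results")
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS service_logs "
               "(id INT PRIMARY KEY, payload TEXT)")

# store the log captured from the API session
cursor.execute("INSERT INTO service_logs (id, payload) VALUES (%s, %s)",
               (1, logresults))
conn.commit()
conn.close()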