Data Warehouse And Its Applications In Agriculture

 


K. P. Wagh, Assistant Professor, Gf’s GCOE Jalgaon, Kishorwagh2000@yahoo.com

Dr. Satish R. Kolhe, Reader, NMU Jalgaon, srkolhe2000@gmail.com

 

A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.

Economic foundations and productivity growth depend on the agricultural sector. Agriculture is the driving force behind the way of life and the source of earnings for the majority of people. More than 60 percent of the population lives in rural areas, and the majority are farmers. The rural communities, as the main producers for the country’s food productivity and food security, earn only 11 percent of Gross Domestic Product (GDP). The arrival of the information age guides this country to new development strategies.

The National Electronics and Computer Technology Center (NECTEC), in collaboration with the Ministry of Agriculture, has launched the “Agriculture Information Network” in response to the unmet information requirements of the agricultural sector. Farmers should benefit from the contents provided, which include risk assessment, an agriculture warning system and an agricultural knowledge base, and which aim to improve the technology, productivity, income and stability of India’s agriculture sector in the age of Information Technology. The data warehouse consists of common databases and geo-spatial databases from various departments and organizations in the country and abroad. Farmers can access the contents through the Internet by themselves or through groups of professionals called “Information Brokers”.

 

Keywords: Data Warehouse, Agriculture, IT

 

 

1.    Introduction

A data warehouse [1] is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.

  

Goals of Data Warehousing

  • Facilitate reporting as well as analysis
  • Maintain an organization’s historical information
  • Be an adaptive and resilient source of information
  • Be the foundation for decision making

  

Data Warehouse Architecture

A data warehouse architecture comprises:

  • Operational source systems
  • A data staging area
  • One or more conformed data marts
  • A data warehouse database

 

Operational Source Systems

Operational source systems [1] are developed to capture and process original business transactions. These systems are designed for data entry, not for reporting, but they are the source from which the data warehouse is populated.

 

Data Staging Area

The data staging area is where the raw operational data is extracted, cleaned, transformed and combined so that it can be reported on and queried by users. This area lies between the operational source systems and the user database and is typically not accessible to users.

 

Data staging is a major process that includes the following sub-procedures:

  • Extraction

The extract step is the first step of getting data into the data warehouse environment. Extracting means reading and understanding the source data, and copying the parts that are needed to the data staging area for further work.

  • Transformation

Once the data is extracted into the data staging area, there are many transformation steps, including:

1.  Cleaning the data by correcting misspellings, resolving domain conflicts, dealing with missing data elements, and parsing into standard formats.

2.  Purging selected fields from the legacy data that are not useful for the data warehouse.

3.  Combining data sources by matching exactly on key values or by performing fuzzy matches on non-key attributes.

4.  Creating surrogate keys for each dimension record in order to avoid dependency on legacy-defined keys, where the surrogate key generation process enforces referential integrity between the dimension tables and fact tables.

5.  Building the aggregates for boosting the performance of common queries.

  • Loading and indexing

At the end of the transformation process, the data is in the form of load record images. Loading in the data warehouse environment usually takes the form of replicating the dimensional tables and fact tables and presenting these tables to the bulk loading facilities of each recipient data mart. Bulk loading is a very important capability, to be contrasted with record-at-a-time loading, which is far slower. The target data mart must then index the newly arrived data for query performance.
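The three staging sub-procedures can be sketched in a few lines of Python. This is only an illustrative sketch: the source records, the `SPELLING_FIXES` table, the table names, and the choice to drop rows with a missing measure are all assumptions made for the example, not part of any particular product.

```python
from collections import defaultdict

# Hypothetical raw records from two operational sources (illustrative data only).
SOURCE_A = [
    {"crop": " Wheat ", "district": "Jalgaon", "yield_kg": "1200", "legacy_id": "A-17"},
    {"crop": "wheet", "district": "Jalgaon", "yield_kg": "", "legacy_id": "A-18"},
]
SOURCE_B = [
    {"crop": "RICE", "district": "Nashik", "yield_kg": "900", "legacy_id": "B-03"},
]

# Assumed correction table for known misspellings.
SPELLING_FIXES = {"wheet": "wheat"}


def extract(*sources):
    """Copy the records from each source into one staging list."""
    return [dict(row) for source in sources for row in source]


def transform(rows):
    """Clean, purge, assign surrogate keys, and build an aggregate."""
    dimension = {}  # crop name -> surrogate key
    facts = []
    for row in rows:
        crop = row["crop"].strip().lower()      # parse into a standard format
        crop = SPELLING_FIXES.get(crop, crop)   # correct misspellings
        if not row["yield_kg"]:                 # missing measure: drop the row
            continue
        # Surrogate key, independent of legacy_id (which is purged here).
        crop_key = dimension.setdefault(crop, len(dimension) + 1)
        facts.append({"crop_key": crop_key,
                      "district": row["district"],
                      "yield_kg": int(row["yield_kg"])})
    totals = defaultdict(int)                   # aggregate for common queries
    for fact in facts:
        totals[fact["crop_key"]] += fact["yield_kg"]
    return dimension, facts, dict(totals)


def load(warehouse, dimension, facts, totals):
    """Load whole tables at once rather than one record at a time."""
    warehouse["dim_crop"] = dimension
    warehouse["fact_yield"] = facts
    warehouse["agg_yield_by_crop"] = totals
    return warehouse


warehouse = load({}, *transform(extract(SOURCE_A, SOURCE_B)))
```

Indexing is not shown; in a real data mart it would follow the load step.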

 

Data Mart

A data mart is a logical subset of an enterprise-wide data warehouse. For example, a data warehouse for a retail chain is constructed incrementally from individual, conformed data marts dealing with separate subject areas such as product sales. Dimensional data marts are organized by subject area, such as sales, finance, and marketing, and coordinated by data category, such as customer, product, and location. These flexible information stores allow data structures to respond to business changes: product line additions, new staff responsibilities, mergers, consolidations, and acquisitions.

  

Data Warehouse Database

A data warehouse database contains data that is organized and stored specifically for direct user queries and reports. It differs from an OLTP database in that it is designed primarily for reads, not writes. An OLAP application is a system designed for few but complex (read-only) requests. An OLTP application is a system designed for many but simple concurrent (and updating) requests.

 

Metadata

Metadata defines the content and location of the data in the data warehouse, the relationships between the operational databases and the data warehouse, and the business views of the warehouse data as accessible to the end-user tools. Users search the metadata to find the subject areas and the definitions of the data.

For decision support, the pointers required to the data warehouse are provided by the metadata. It therefore acts as a logical link between the decision support system application and the data warehouse. Any data warehouse design should thus ensure that there is a mechanism that populates and maintains the metadata repository and that all access paths to the data warehouse have metadata as an entry point. In other words, no direct access to the data warehouse data should be permitted unless it uses the metadata definitions to gain access. Metadata can be defined by the user in any given data warehousing environment. The software environment, as determined by the software tools used, will provide a facility for metadata definition in a metadata repository.
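The idea of metadata as the only entry point can be illustrated with a small Python sketch. The repository layout, the entry fields, and the function names `find_subject` and `read_subject` are hypothetical, chosen only to show the principle.

```python
# Hypothetical metadata repository: business name -> location and definition.
METADATA = {
    "crop_yield": {
        "table": "fact_yield",
        "source": "district agriculture office returns",
        "definition": "Harvested quantity in kilograms per district",
    },
}

# Illustrative warehouse contents keyed by physical table name.
WAREHOUSE_TABLES = {
    "fact_yield": [{"district": "Jalgaon", "yield_kg": 1200}],
}


def find_subject(keyword):
    """Search the metadata for subjects whose definition mentions keyword."""
    return [name for name, entry in METADATA.items()
            if keyword.lower() in entry["definition"].lower()]


def read_subject(name):
    """Resolve a business name through the metadata; no direct table access."""
    if name not in METADATA:
        raise KeyError(f"no metadata definition for {name!r}")
    return WAREHOUSE_TABLES[METADATA[name]["table"]]
```

Here a caller asking for the physical table name `fact_yield` directly is refused, while a request for the business name `crop_yield` is resolved through the repository.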

 

OLAP Vs OLTP

 

OLTP (Online Transaction Processing)

  • OLTP servers handle mission-critical production data accessed through simple queries
  • Usually handles queries of an automated nature
  • OLTP applications consist of a large number of relatively simple transactions.
  • Most often contains data organised on the basis of logical relations between normalised tables

OLAP (Online Analytical Processing)

  • OLAP servers handle management-critical data accessed through an iterative analytical investigation
  • Usually handles queries of an ad-hoc nature
  • Supports more complex and demanding transactions
  • Contains logically organised data in multiple dimensions
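The contrast can be made concrete with Python's built-in `sqlite3` module. The `sale` table and its rows are assumptions made for illustration; a production system would normally use separate OLTP and OLAP servers rather than one in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, "
             "product TEXT, amount REAL)")

# OLTP style: many small transactions, each touching a single row.
with conn:
    conn.execute("INSERT INTO sale (region, product, amount) VALUES (?, ?, ?)",
                 ("North", "seed", 120.0))
with conn:
    conn.execute("INSERT INTO sale (region, product, amount) VALUES (?, ?, ?)",
                 ("North", "fertiliser", 300.0))
with conn:
    conn.execute("INSERT INTO sale (region, product, amount) VALUES (?, ?, ?)",
                 ("South", "seed", 80.0))
with conn:
    conn.execute("UPDATE sale SET amount = ? WHERE id = ?", (100.0, 3))

# OLAP style: one read-only query that scans and summarises many rows.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()
```

Each `with conn:` block commits one short transaction, while the final query only reads and aggregates.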

 

2.    Warehouse Schema Design

Dimensional modeling refers to a set of data modeling techniques that have gained popularity and acceptance for data warehouse implementation. It is one of the key techniques in data warehousing. Two types of tables are used in dimensional modeling: fact tables and dimensional tables.

 

 

 

Fact Tables  

Fact tables record the actual facts and measures of the business. Facts are numeric data items that are of interest to the business. Examples from telecommunications: length of a call in minutes, average number of calls.

 

Dimensional Tables 

Dimensional tables establish the context of the facts. They store fields that describe the facts. Examples from telecommunications: call origin, call destination. A schema is a fact table plus its related dimensional tables.
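The telecommunications example can be sketched as a tiny schema using Python's built-in `sqlite3` module. The table names, column names, and rows below are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensional table: describes the context of a call (assumed columns).
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,   -- surrogate key
        city TEXT NOT NULL
    );
    -- Fact table: one row per call, holding the numeric measure.
    CREATE TABLE fact_call (
        origin_key INTEGER NOT NULL REFERENCES dim_location(location_key),
        destination_key INTEGER NOT NULL REFERENCES dim_location(location_key),
        call_minutes REAL NOT NULL
    );
    INSERT INTO dim_location VALUES (1, 'Jalgaon'), (2, 'Pune');
    INSERT INTO fact_call VALUES (1, 2, 4.0), (1, 2, 6.0), (2, 1, 3.0);
""")

# Number of calls and average call length by origin city:
# the dimensional table gives the fact its context.
rows = conn.execute("""
    SELECT o.city, COUNT(*), AVG(f.call_minutes)
    FROM fact_call AS f
    JOIN dim_location AS o ON o.location_key = f.origin_key
    GROUP BY o.city
    ORDER BY o.city
""").fetchall()
```

The same `dim_location` table serves both the origin and the destination of a call, so one dimension can play more than one role for a fact.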

  

3. Crucial Decision in Designing a Data Warehouse

The job of designing and implementing a data warehouse [3] is a very challenging and difficult one, even though a lot of focus and importance is attached to it. The designer of the data warehouse may be asked by top management: “Take all enterprise data and build a data warehouse such that the management can get answers to all their questions”. This is a daunting task, though the responsibility is visible and exciting. But how to get started? Where to start? Which data should be put in first? Where is that data available? Which queries should be answered? How can the scope of the project be brought down to something smaller and manageable, yet scalable enough to be upgraded gradually into a comprehensive data warehouse environment?

The recent trend is to build data marts before a real, large data warehouse is built. People want something smaller, so as to get manageable results before proceeding to the real data warehouse.

Ralph Kimball identified a nine-step method as follows:

Step 1: Choose the subject matter.

Step 2: Decide what the fact table represents.

Step 3: Identify and conform the dimensions.

Step 4: Choose the facts.

Step 5: Store precalculations in the fact table.

Step 6: Define the dimension tables.

Step 7: Define the duration of the database and the frequency of updates.

Step 8: Track slowly changing dimensions.

Step 9: Decide the query priorities and query modes.

All the above steps are required before the data warehouse is implemented. The final step, step 10, is to implement a simple data warehouse or data mart. The approach should be ‘from simple to complex’. First, only a few data marts are identified, designed and implemented. A data warehouse will then emerge gradually.

Interaction with the users is essential for obtaining answers to many of these questions. The users to be interviewed include top management, middle management, executives and operational users, in addition to the sales force and marketing teams. From these interviews a clear picture emerges of what their problems are and how they can possibly be solved with the help of data warehousing.

4.  Various Technology Considerations

The following technological issues [3] need to be considered when designing and implementing a data warehouse:

1. The hardware platforms for Data Warehouse

2. DBMS for supporting data warehouse

3. Communication and network infrastructure for a Data Warehouse

4. The system management /operating system platforms

5. The software tools for building, operating and using Data Warehouse

  

Hardware Platform

Organizations normally tend to use their existing hardware platform for data warehouse development. However, the disk storage requirements of a data warehouse will be significantly larger than those of a single application.

If the data warehouse or data mart is small in data size, a normal Pentium server will probably be sufficient, provided reliability requirements are not very high. For a regular large data warehouse application, however, the server has to be specialized for the tasks associated with a data warehouse. A mainframe, for example, is well suited to serve as a data warehouse server. What features are required for a successful data warehouse server? Firstly, it should be able to support large data volumes and complex query processing. In addition, it has to be highly scalable: as the user population keeps growing, the network traffic and the access traffic increase significantly. The requirement of a data warehouse server is therefore scalable high performance for data loading and ad hoc query processing, as well as the ability to support a large database in a reliable and efficient manner. If the querying is going to take place over a large public data network, then a multiprocessor configuration will be required for parallel query processing. In a complex server configuration with multiple processors and large I/O bandwidth, a proper balance needs to be struck between I/O and processing power.

 

DBMS Selection

Next to the hardware platform, the most critical factor is the database selection, which determines the speed and performance of the data warehousing environment. The requirements of a DBMS for data warehousing are scalability, high-volume storage and processing, and high throughput.

The majority of established RDBMS vendors have implemented various degrees of parallelism in their products. Even though all the well-known vendors (IBM, Oracle, Sybase) support parallel database processing, some of them have improved their architectures to better suit the specialized requirements of the data warehouse. The RDBMS products provide additional modules for OLAP cubes. The designer or user of the data warehouse can choose the OLAP server, database server and web server according to the requirements.

 

Communication and Networking Infrastructure

A data warehouse can be Internet enabled or intranet enabled, as required. If it is web enabled, the networking is taken care of by the Internet. If it is only intranet based, then the appropriate LAN operational environment should be accessible to all the identified users, so network expansion may be required. In web-enabled data warehouses, issues of security, privacy and accessibility need to be considered carefully. Accordingly, web enablement facilities should be ensured in the software tools used for data warehouse development.

  

Stages in Implementation

A data warehouse cannot simply be purchased and installed. Its implementation requires the integration of many products. The steps of a data warehouse implementation are as follows:

Step 1: Collect and analyze the business requirements.

Step 2: Create a data model and a physical design for the data warehouse after deciding on the appropriate hardware platform.

Step 3: Define the data sources.

Step 4: Choose the DBMS and software platform for the data warehouse.

Step 5: Extract the data from the operational data sources, translate it, clean it up and load it into the data warehouse model or data mart.

Step 6: Choose database access and reporting tools.

Step 7: Choose database connectivity software.

Step 8: Choose the data analysis (OLAP) and presentation (client GUI) software.

Step 9: Keep refreshing the data warehouse periodically.

  

Access Tools

With the exception of SAS (of SAS Institute), the data warehouse/OLAP vendors do not currently provide comprehensive single-window software tools capable of handling all aspects of a data warehousing project implementation. SAS alone largely meets the requirement independently, as it has its own internal database with the capability to import data from any vendor’s DBMS software. Therefore one can implement a data warehousing and data mining solution independently with SAS.

The best way to choose a group of tools is to understand the different types of access to the data and reporting, and to select the best tool in the market for each kind of access, considering capability and compatibility. The types of access and reporting are as follows:

  1. Time series analysis
  2. Data visualization, graphing, charting and pivoting
  3. Complex textual search (text mining)
  4. General statistical analysis
  5. Artificial intelligence techniques for hypothesis testing, trend discovery, and identification and validation of data clusters and segments (also useful for data mining)
  6. Mapping of spatial information into a geographic information system
  7. Ad hoc user-specific queries
  8. Predefined repeatable queries
  9. Drilling down interactively
  10. Reporting the analysis by drilling down
  11. Complex queries with multi-table joins, multilevel sub-queries and sophisticated search criteria

In some applications, the user requirements may exceed the capability of the tools. A number of query tools available in the market today enable an ordinary user to build customized reports by easily composing and executing ad hoc queries, without needing any knowledge of the underlying design details, database technology, SQL, or even the data model.
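Interactive drilling down, one of the access types listed above, can be sketched in plain Python. The level hierarchy (year, state, district), the rainfall figures, and the `summarise` helper are all made up for illustration; a real OLAP tool would run this against the warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Illustrative fact rows: (year, state, district, rainfall_mm). Values are made up.
FACTS = [
    (2009, "Maharashtra", "Jalgaon", 700),
    (2009, "Maharashtra", "Nashik", 900),
    (2009, "Gujarat", "Surat", 1100),
    (2010, "Maharashtra", "Jalgaon", 650),
]
LEVELS = ("year", "state", "district")


def summarise(facts, depth, **filters):
    """Total the measure over the first `depth` levels for rows matching filters."""
    totals = defaultdict(int)
    for *dims, measure in facts:
        row = dict(zip(LEVELS, dims))
        if all(row[name] == value for name, value in filters.items()):
            totals[tuple(dims[:depth])] += measure
    return dict(totals)


by_year = summarise(FACTS, depth=1)                    # top-level view
by_state_2009 = summarise(FACTS, depth=2, year=2009)   # drill down into 2009
```

Increasing `depth` while adding a filter corresponds to a user clicking on one summary row to see the next level of detail beneath it.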

  

5.  Its Applications in Agriculture

  

Project: Agriculture Information System Network (AGRISNET)

The Department of Agriculture and Cooperation (DAC) [2] has taken steps to establish the “Agricultural Information System Network (AGRISNET)” in collaboration with NIC. The proposal recommends (i) the state-of-the-art IT infrastructure required to establish AGRISNET as the intranet over NICNET, (ii) development of databases and information systems for decision support for evaluation, monitoring and policy formulation, (iii) human resources development, (iv) multimedia-based training and demonstration of transfer of technology to strengthen farm research and education using broadcast VSATs, and (v) special interest groups in respect of subjects, problems, programmes, schemes, etc., and above all, making Indian agriculture on-line for Internet and intranet access through AGRISNET nodes. AGRISNET nodes are envisaged to be established at

  • DAC Hqrs (Krishi Bhawan), 

  • DAC Attached Offices and its regional offices, 

  • DAC Subordinate Offices and its regional units, 

  • DAC Public Sector Undertakings (NSC&SFCI) and sub-units, 

  • DAC Autonomous Organizations, 

  • Apex Cooperative Organizations 

  • State Agriculture Departments 

  • NCT/UT Agriculture Departments 

  • District Agriculture Offices and 

  • Block Agriculture offices 

In this direction, IFFCO has taken up a project in association with Indian Space Research Organisation (ISRO) to utilise satellite based remote sensing data and Geographical Information Systems (GIS). Attention may be drawn to the fact that the developed countries have been utilising precision farming with the help of IT tools for a long time. While this will take a long time for our country due to small holdings, it is to be noted that GIS has an invaluable role to play even in the existing conditions. Remote sensing and GIS information can provide warnings on evolving crop stresses, crop vigour, etc.

The IFFCO-ISRO GIS project supports the efficient and timely availability of IFFCO’s fertiliser to farmers through better logistics and efficient operations. It endeavours to provide farmers’ advisory services that give decision support on land-related issues, weather, etc. In addition to the GIS-based services, an effort is being made to create databases that contain information of interest to farmers. These include recommendations on packages of practices for major cereals, pulses, horticulture, floriculture, animal husbandry, etc. Information on all inputs, such as seeds and fertiliser, their sources, current availability and prices, and on the availability of credit, the alternatives available and their terms and conditions, is also to be provided. An important service envisaged is access to the nearest expert in case of stress or any other problem observed in the crops. Facilities are to be provided to encourage the sharing of farm experiences through various crop forums. Many of the agricultural extension services are also proposed to be made available online using multimedia.

To help farmers obtain the best possible price, information on various agricultural output markets (mandis) is also being provided. The objective is to report the prices at different mandis so that a farmer can move his produce to the mandi where he can expect a better price. Other areas of interest to farmers, such as distance education and location-specific news, are also planned. Access to other related sites of interest, such as those relating to courts, health, etc., is also to be provided.

  

6.  Conclusions

Analytical exploration of vast amounts of agricultural data can best be supported by appropriate application of data warehousing and OLAP technologies. A data warehouse provides an efficient and reliable storage structure for vast amounts of data, while OLAP techniques provide mechanisms for analysing this data.

 

7.  References

[1] Anil Rai, Data Warehouse and Its Applications in Agriculture, Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi.

[2] S.C. Mittal, Information Technology in Agriculture.

[3] C.S.R. Prabhu, Data Warehousing: Concepts, Techniques, Products and Applications.

 

  
