Automating Web Query import

R

Roger on Excel

I use hyperlinks for chemicals that access a webpage for the individual
compounds showing tables of data for that chemical.

I have a large list of chemicals and I want to automate the downloading of
specific data in the tables from the web pages.

The chemical data is stored in the same format of tables on each webpage.

Can anyone help?

Thanks,

Roger
 
J

Joel

Ther are tow basic mathods that can be used.

1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start
Recording).. then perform one query manually by going to Data - Import
External Data - New Web Query. Next modify the the recorded macro as
required to add a loop changing the chemicals and the destination location so
the data doesn't over-write each other.

If one above doesn't work

2) Open an Internet Explorer application in Excel. through the Internet
Explorer request each chemical and extract each results through the Internet
Explorer Application.

Let me know which approach you want to use. I can help with both approaches.
 
J

Joel

Thhe code below was very simple. I did it in about 15 minutes. It gets the
names of all the chemicals. I will work on the rest later.


Sub Getchemicals()

Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count))
TempSht.Name = "Temp"
Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Summary"

URLFolder =
"URL;http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For i = 0 To 25
AlphaLetter = Chr(Asc("a") + i)

TempSht.Cells.ClearContents

With TempSht.QueryTables.Add(Connection:= _
URLFolder & AlphaLetter & "_index.htm", _
Destination:=TempSht.Range("A1"))

.Name = "a_index"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

'move data from tempory sheet to chemical sheet
TempRowCount = 17
Do While Range("C" & TempRowCount) <> ""
ChemicalSht.Range("A" & ChemicalRowCount) = _
TempSht.Range("C" & TempRowCount)

ChemicalRowCount = ChemicalRowCount + 1
TempRowCount = TempRowCount + 1
Loop
Next i

TempSht.Cells.ClearContents

End Sub
 
J

Joel

I had to switch to method 2 to get each chemical webpage. Try this code.
Which data do you need? try using method one manually (Data - Import
External Data - New Web Query) with one of the webpages from the code below
and select one or more tables and see if the results are usable.

I will help as required.


Sub Getchemicals2()


Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Chemicals"

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URLFolder = _

"http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For Letters = 0 To 25
AlphaLetter = Chr(Asc("a") + Letters)

URL = URLFolder & AlphaLetter & "_index.htm"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4
DoEvents
Loop

Do While IE.busy = True
DoEvents
Loop

H2Found = False
For Each itm In IE.document.all
If H2Found = False Then
If itm.tagname = "H2" Then
H2Found = True
End If
Else

If itm.tagname = "A" Then
If itm.innertext = "" Then Exit For

'chemical name
ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext
'webpage
ChemicalSht.Range("B" & ChemicalRowCount) = itm.href

ChemicalRowCount = ChemicalRowCount + 1
End If
End If
Next itm

Next Letters


End Sub
 
R

Roger on Excel

Joel,

This is excellent.

Thanks for your advice.

How would I download other information from other tables - for example I
would like to input further data from the tables into further adjacent
columns so one can use the table as a vlookup source?

I see you use references like itm.innertext - does this select individual
line items from the tables on the web page?

Thanks, Roger
 
J

Joel

Right now I'm not sure which is the better method to use to get the data from
the tables. It depends on which items from the tables you ae looking for. I
used method 1 (web query) to download everything from the table. Some of the
data was hard to manipulate into ccolumn format on a spreadsheet. We would
have to do something like I did to get the chemical names. first perform a
query into a temporary page and then move the data to a summary page.

When I work with the Internet Explorer it is harder to figure out how to get
the data, but the coding usually is simplier. I use a few tricks to get the
data.

1) I look at the source code for the webpage by using the Internet Explorer
menu View - Source. I first look for the data I'm trying to extract. I look
in particular for tags. a tag with look like the following


<abc
.......................................................................
/abc>

The tag would be "abc". The tags are usually nested and there is always an
opening (<) and closing (>) character. The closing marker. The closing
marker may not have the tag name and may just lmay be a forward slah followed
by an angle bracket (/>).

I also look for classname which are in the source

id="msviRegionId"

The Clssname is always id= followed by the name in double quotes.

Using code you can get these items using the folowwing two statements

Set ClassB = IE.document.getelementsbytagname("abc")

Set RegionID = IE.document.getElementById("msviRegionId")

Each will return multiple items and to get each item you can use a look like
this

for each itm in ClassB
'add code here
next itm

2) The data is always in the innertext property. It depends how the data is
organized which method(s) I use to get the data. I usually dump the data
first to a spreadsheet using the following code

RowCount = 1
for each itm in IE.Document.All
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.classname
Range("C" & RowCount) = itm.innertext
RowCount = RowCount + 1
next itm

Note: innertext can sometimes by very long a cause memory errors. I
sometimes have to limit the data using left
Range("B" & RowCount) = left(itm.classname,1024)


or

RowCount = 1
for each itm in ClassB
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.classname
Range("C" & RowCount) = itm.innertext
RowCount = RowCount + 1
next itm

3) I also you break point and watch items. I will put a break in one of the
FOR loop above by left click with mouse on the line of code to bring the
cursor to the line. Then Pressing F9 to add Break Point.

Then I add a WATCH item for debugging. I will highlight ITM with the mouse
and then right click. Next I will select ADD WATCH and press OK on dialog
window to add the watch. Finally, I will press the plus (+) sign on the
watch window to see the data. Sometimes you can find the data in linked
lists under "children". There are other properties that are sometimes useful
like href (I used thsi to get the URL addresses of the chemicals).
 
R

Roger on Excel

Thanks for looking at this Joel,

I will take a look at what you suggest later today. It sounds quite
complicated, but I am keen to learn more so will work through your
suggestions and see what I get.

Best regards,

Roger
 
J

Joel

The best way of doing this way to use Internet Explorer to get the chemical
names and webpages. It was better to use web query to get the actual data.
Here is all the code. It was mostly copying and modifying the same code over
and over and over again.

I have an extra line of code "Exit For" to allow only the first chemical to
run. Make sure you verify what I did before removing line and running 5600
chemicals


Sub Getchemicals2()

Found = False
For Each sht In Sheets
If sht.Name = "Chemicals" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Chemicals"
Else
Sheets("Chemicals").Cells.ClearContents
End If


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URLFolder = _

"http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For Letters = 0 To 25
AlphaLetter = Chr(Asc("a") + Letters)

URL = URLFolder & AlphaLetter & "_index.htm"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4
DoEvents
Loop

Do While IE.busy = True
DoEvents
Loop

H2Found = False
For Each itm In IE.document.all
If H2Found = False Then
If itm.tagname = "H2" Then
H2Found = True
End If
Else

If itm.tagname = "A" Then
If itm.innertext = "" Then Exit For

'chemical name
ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext
'webpage
ChemicalSht.Range("B" & ChemicalRowCount) = itm.href

ChemicalRowCount = ChemicalRowCount + 1
End If
End If
Next itm

Next Letters


End Sub

Sub GetData()

Found = False
For Each sht In Sheets
If sht.Name = "Temp" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count))
TempSht.Name = "Temp"
Else
Set TempSht = Sheets("Temp")
TempSht.Cells.ClearContents
End If

Found = False
For Each sht In Sheets
If sht.Name = "Data" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set DataSht = Sheets.Add(after:=Sheets(Sheets.Count))
DataSht.Name = "Data"
Else
Set DataSht = Sheets("Data")
DataSht.Cells.ClearContents
End If

Call MakeHeaders


Set ChemicalSht = Sheets("Chemicals")
With ChemicalSht
LastRow = .Range("A" & Rows.Count).End(xlUp).Row
Set Chemicals = .Range("A1:A" & LastRow)

NewRowCount = 4
For Each Chemical In Chemicals
TempSht.Cells.ClearContents

With TempSht.QueryTables.Add(Connection:= _
"URL;" & Chemical.Offset(0, 1), _
Destination:=TempSht.Range("A1"))


.Name = "Temp"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

Call MoveData(Chemical, NewRowCount)
NewRowCount = NewRowCount + 1
'------------------------------------- Remove Line
------------------------------------------------
Exit Fo
'--------------------------------------------------------------------------------------------------
Next Chemical
End With
DataSht.Columns("A:AQ").AutoFit
DataSht.Columns("AG").ColumnWidth = 50

DataSht.Rows("1:" & NewRowCount).VerticalAlignment = xlTop
End Sub



Sub MakeHeaders()

With Sheets("Data")
.Range("A1") = "Chemical Name"
.Range("B1") = "Generic Name(s)"
.Range("C1") = "CAS No"
.Range("D1") = "RTECS No"
.Range("E1") = "UN No"
.Range("F1") = "EC No"
.Range("G1") = "Alternate Names"
.Columns("G").WrapText = True
.Range("H1") = "Molecular Mass"

.Range("I1:K1").MergeCells = True
.Range("I1") = "Fire Hazard"
.Range("I1").HorizontalAlignment = xlCenter
.Range("I2") = "Acute Hazard/Symptoms"
.Range("J2") = "Prevention"
.Range("K2") = "First Aid/Fire Fighting"

.Range("L1:N1").MergeCells = True
.Range("L1") = "Explosion Hazard"
.Range("L1").HorizontalAlignment = xlCenter
.Range("L2") = "Acute Hazard/Symptoms"
.Range("M2") = "Prevention"
.Range("N2") = "First Aid/Fire Fighting"

.Range("O1:Q1").MergeCells = True
.Range("O1") = "Exposure"
.Range("O1").HorizontalAlignment = xlCenter
.Range("O2") = "Acute Hazard/Symptoms"
.Range("P2") = "Prevention"
.Range("Q2") = "First Aid/Fire Fighting"

.Range("R1:T1").MergeCells = True
.Range("R1") = "Inhalation Exposure"
.Range("R1").HorizontalAlignment = xlCenter
.Range("R2") = "Acute Hazard/Symptoms"
.Range("S2") = "Prevention"
.Range("T2") = "First Aid/Fire Fighting"

.Range("U1:W1").MergeCells = True
.Range("U1") = "Skin Exposure"
.Range("U1").HorizontalAlignment = xlCenter
.Range("U2") = "Acute Hazard/Symptoms"
.Range("V2") = "Prevention"
.Range("W2") = "First Aid/Fire Fighting"

.Range("X1:Z1").MergeCells = True
.Range("X1") = "Eyes Exposure"
.Range("X1").HorizontalAlignment = xlCenter
.Range("X2") = "Acute Hazard/Symptoms"
.Range("Y2") = "Prevention"
.Range("Z2") = "First Aid/Fire Fighting"

.Range("AA1:AC1").MergeCells = True
.Range("AA1") = "Ingestion Exposure"
.Range("AA1").HorizontalAlignment = xlCenter
.Range("AA2") = "Acute Hazard/Symptoms"
.Range("AB2") = "Prevention"
.Range("AC2") = "First Aid/Fire Fighting"

.Range("AD1") = "Spillage Disposal"
.Range("AE1") = "Packaging and Labelling"
.Columns("AE").WrapText = True
.Range("AF1") = "Emergency Response"
.Range("AG1") = "Safe Storage"
.Columns("AG").WrapText = True

.Range("AH1") = "Physical State; Appearance"
.Range("AI1") = "Routes of exposure"
.Range("AJ1") = "Chemical dangers"
.Range("AK1") = "Inhalation risk"
.Range("AL1") = "Occupational exposure limits"
.Range("AM1") = "Effects of short-term exposure"
.Range("AN1") = "Effects of long-term or repeated exposure"

.Range("AO1") = "PHYSICAL PROPERTIES"
.Range("AP1") = "ENVIRONMENTAL DATA"
.Range("AQ1") = "NOTES"

.Columns("A:AQ").AutoFit
End With
End Sub




Sub MoveData(Chemical, RowCount)

Set DataSht = Sheets("Data")
'Use ICSC: to get chemical names
With Sheets("Temp")
DataSht.Range("A" & RowCount) = Chemical

Set c = .Columns("B").Find(what:="ICSC:", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find ISCS for Chemical : " & Chemical)
Stop
Else
'Move Generic Name
DataSht.Range("B" & RowCount) = c.Offset(0, -1).Value
End If

'Use ISCS Number to find first Row of Alternate Names
FirstAlternateRow = c.Row + 2

Set c = .Columns("A").Find(what:="CAS No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find CAS Number for Chemical : " & Chemical)
Stop
Else
'Move Generic Name
DataSht.Range("C" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
End If

'Use CAS Number to find Last Row of Alternate Names
LastAlternateRow = c.Row - 1
'Get Alternate Names
Alternate = ""
For TempRowCount = FirstAlternateRow To LastAlternateRow
If Alternate = "" Then
Alternate = .Range("A" & TempRowCount)
Else
Alternate = Alternate & Chr(10) & .Range("A" & TempRowCount)
End If
Next TempRowCount

'Move Alternate Name
DataSht.Range("G" & RowCount) = Alternate

Set c = .Columns("A").Find(what:="RTECS No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find RTECS Number for Chemical : " &
Chemical)
Stop
Else
'Move RTECS
DataSht.Range("D" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
End If

Set c = .Columns("A").Find(what:="UN No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find UN Number for Chemical : " & Chemical)
Stop
Else
'Move UN No
DataSht.Range("E" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
End If

Set c = .Columns("A").Find(what:="EC No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find EC Number for Chemical : " & Chemical)
Stop
Else
'Move EC No
DataSht.Range("F" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
End If

Set c = .Columns("C").Find(what:="Molecular mass:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Molecular for Chemical : " & Chemical)
Stop
Else
'Move molecular mass
DataSht.Range("G" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
End If

Set c = .Columns("A").Find(what:="FIRE", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find FIRE Hazard for Chemical : " & Chemical)
Stop
Else
'Move Fire hazard
DataSht.Range("I" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("J" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("K" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="EXPLOSION", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Explosion Hazard for Chemical : " &
Chemical)
Stop
Else
'Move Explosion Hazard
DataSht.Range("L" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("M" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("N" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="EXPOSURE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Exposure for Chemical : " & Chemical)
Stop
Else
'find 2nd occurance
Set c = .Columns("A").FindNext(after:=c)

'Move Exposure
DataSht.Range("O" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("P" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("Q" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Inhalation", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Inhalation Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Inhalation Exposure
DataSht.Range("R" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("S" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("T" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Skin", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Skin Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Skin Exposure
DataSht.Range("U" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("V" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("W" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Eyes", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Eyes Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Eyes Exposure
DataSht.Range("X" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("Y" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("Z" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Ingestion", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Ingestion Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Ingestion Exposure
DataSht.Range("AA" & RowCount) = c.Offset(0, 1).Value
DataSht.Range("AB" & RowCount) = c.Offset(0, 2).Value
DataSht.Range("AC" & RowCount) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="SPILLAGE DISPOSAL", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find SPILLAGE DISPOSAL for Chemical : " &
Chemical)
Stop
Else
'Move SPILLAGE DISPOSAL
DataSht.Range("AD" & RowCount) = c.Offset(1, 0).Value
End If

Set c = .Columns("B").Find(what:="PACKAGING & LABELLING",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find PACKAGING & LABELLING for Chemical : "
& Chemical)
Stop
Else
'Move PACKAGING & LABELLING
Packaging = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Packaging = "" Then
Packaging = .Range("B" & TempRowCount)
Else
Packaging = Packaging & Chr(10) & .Range("B" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AE" & RowCount) = Packaging
End If

Set c = .Columns("A").Find(what:="EMERGENCY RESPONSE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find EMERGENCY RESPONSE for Chemical : " &
Chemical)
Stop
Else
'Move EMERGENCY RESPONSE
DataSht.Range("AF" & RowCount) = c.Offset(1, 0).Value
End If

Set c = .Columns("B").Find(what:="SAFE STORAGE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Safe Storage for Chemical : " &
Chemical)
Stop
Else
'Move Safe Storage
DataSht.Range("AG" & RowCount) = c.Offset(1, 0).Value
End If

Set c = .Columns("A").Find(what:="Physical State; Appearance",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Physical State; Appearance for
Chemical : " & Chemical)
Stop
Else
'Move Physical State; Appearance
Appearance = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Appearance = "" Then
Appearance = .Range("A" & TempRowCount)
Else
Appearance = Appearance & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AH" & RowCount) = Appearance
End If

Set c = .Columns("B").Find(what:="Routes of exposure", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Routes of exposure for Chemical : " &
Chemical)
Stop
Else
'Move Routes of exposure
Routes = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Routes = "" Then
Routes = .Range("B" & TempRowCount)
Else
Routes = Routes & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AI" & RowCount) = Routes
End If

Set c = .Columns("A").Find(what:="Chemical dangers", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Chemical dangers for Chemical : " &
Chemical)
Stop
Else
'Move Chemical dangers
Dangers = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Dangers = "" Then
Dangers = .Range("A" & TempRowCount)
Else
Dangers = Dangers & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AJ" & RowCount) = Dangers
End If

Set c = .Columns("B").Find(what:="Inhalation risk", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Inhalation risk for Chemical : " &
Chemical)
Stop
Else
'Move Inhalation risk
Inhalation = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Inhalation = "" Then
Inhalation = .Range("B" & TempRowCount)
Else
Inhalation = Inhalation & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AK" & RowCount) = Inhalation
End If

Set c = .Columns("A").Find(what:="Occupational exposure limits",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Occupational exposure limits for
Chemical : " & Chemical)
Stop
Else
'Move Occupational exposure limits
Occupational = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Occupational = "" Then
Occupational = .Range("A" & TempRowCount)
Else
Occupational = Occupational & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AL" & RowCount) = Occupational
End If

Set c = .Columns("B").Find(what:="Effects of short-term exposure",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Effects of short-term exposure for
Chemical : " & Chemical)
Stop
Else
'Move Effects of short-term exposure
ShortTerm = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If ShortTerm = "" Then
ShortTerm = .Range("B" & TempRowCount)
Else
ShortTerm = ShortTerm & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AM" & RowCount) = ShortTerm
End If

Set c = .Columns("B").Find(what:="Effects of long-term or repeated
exposure", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Effects of long-term or repeated
exposure for Chemical : " & Chemical)
Stop
Else
'Move Effects of long-term or repeated exposure
LongTerm = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If LongTerm = "" Then
LongTerm = .Range("B" & TempRowCount)
Else
LongTerm = LongTerm & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AN" & RowCount) = LongTerm
End If

Set c = .Columns("A").Find(what:="PHYSICAL PROPERTIES", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find PHYSICAL PROPERTIES for Chemical : " &
Chemical)
Stop
Else
'Move PHYSICAL PROPERTIES
Physical = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Physical = "" Then
Physical = .Range("B" & TempRowCount)
Else
Physical = Physical & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Range("AO" & RowCount) = Physical
End If

Set c = .Columns("B").Find(what:="ENVIRONMENTAL DATA", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find ENVIRONMENTAL DATA for Chemical : " &
Chemical)
Stop
Else
'Move ENVIRONMENTAL DATA
DataSht.Range("AP" & RowCount) = c.Offset(1, 0).Value
End If

Set c = .Columns("A").Find(what:="NOTES", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find NOTES for Chemical : " & Chemical)
Stop
Else
'Move NOTES
DataSht.Range("AQ" & RowCount) = c.Offset(1, 0).Value
End If

End With


End Sub
 
J

Joel

I found one error. the molecular Mass was in the wrong column


from

'Move molecular mass
DataSht.Range("g" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))

to

'Move molecular mass
DataSht.Range("H" & RowCount) = Trim(Mid(c.Value, InStr(c.Value, ":")
+ 1))
 
J

Joel

I tested for a few more chemicals and found that not all chemicals have the
same properties. You have to remove some of the STOP and Error messages.
Also you may have to add more columns for properties I do not have coded.
 
R

Roger on Excel

Thanks so much for this Joel,

I will check it out tomorrow as im not feeling well this evening. I noticed
that some of the fields have more data than others - a tricky problem for
sure.

All the best,

Roger
 
J

Joel

The macro I provided should be carefully checked. I used only a couple of
chemicals to check my results. I have noticed a few problems shown below

For debugging and to run the code in pieces you can change these statement
in getData(). when you run the code on the full set of data I would run it
in sections. Maybe 500 to 1000 chemicals at a time. If the code ran 10
chemicals a minute, it would take about 10 hours to do all the chemicals.

from
LastRow = .Range("A" & Rows.Count).End(xlUp).Row
Set Chemicals = .Range("A1:A" & LastRow)
NewRowCount = 4
To

FirstRow = 1
LastRow = .Range("A" & Rows.Count).End(xlUp).Row
LastRow = 5
Set Chemicals = .Range("A" & FirstRow & ":A" & LastRow)
NewRowCount = FirstRow + 3 'where the first chemical is put on the
"Data" worksheet

Note: Remeber to take out the EXIT FOR statement which only allows one
chemical to get processed.


To run the Code first get chemicals running the macro Getchemicals2(). You
only need to run this macro once. Then run the Macro GetData() which will
use the webpages from Getchemicals2() to get the rest of the data.


1) Some items I'm look at only one row of data on the Temp sheet and others
I collected multiple rows by using a Do Loop and looking for the 1st blank
row to terminate. I noticed that on the "Notes" section I'm looking a only
one row where are some chemicals there are multiple rows for Notes.

Also Emergency response I'm only looking at 1 row and missing the NFPA
information.

2) I didn't include the chemical composition like C2H1O3. It is in the row
directly above the Molecular Mass

3) I don't know if I included all the data because I used only a couple of
chemicals as examples. There may be other properties (especially in the
Important Data) that I may have missed.

4) The PDF printout of the chemicals has a section on Additional Information
but I didn't see any chemicals with this information. Again I only looked at
a few chemicals.
 
J

Joel

I found some errors and didn't like how I was specifying the column. I
converted the column Leeter to constants so it would be easier to add, delete
and move columns. I also the following:

1) Molecular formula
2) Added to Emergeny Response getting multiple rows of data
3) Added to Notes getting multiple rows of data


Sub Getchemicals2()

Found = False
For Each sht In Sheets
If sht.Name = "Chemicals" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count))
ChemicalSht.Name = "Chemicals"
Else
Sheets("Chemicals").Cells.ClearContents
End If


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URLFolder = _

"http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/"
ChemicalRowCount = 1
For Letters = 0 To 25
AlphaLetter = Chr(Asc("a") + Letters)

URL = URLFolder & AlphaLetter & "_index.htm"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4
DoEvents
Loop

Do While IE.busy = True
DoEvents
Loop

H2Found = False
For Each itm In IE.document.all
If H2Found = False Then
If itm.tagname = "H2" Then
H2Found = True
End If
Else

If itm.tagname = "A" Then
If itm.innertext = "" Then Exit For

'chemical name
ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext
'webpage
ChemicalSht.Range("B" & ChemicalRowCount) = itm.href

ChemicalRowCount = ChemicalRowCount + 1
End If
End If
Next itm

Next Letters


End Sub



Const ChemNameCol = 1
Const GenericNameCol = ChemNameCol + 1
Const CASNoCol = GenericNameCol + 1
Const RTECSNoCol = CASNoCol + 1
Const UNNoCol = RTECSNoCol + 1
Const ECNoCol = UNNoCol + 1
Const MolecularFormCol = ECNoCol + 1
Const AltNameCol = MolecularFormCol + 1
Const MoleMassCol = AltNameCol + 1

'Group of 3 Columns
Const FireHazCol = MoleMassCol + 1
Const ExplosHazCol = FireHazCol + 3
Const ExposureCol = ExplosHazCol + 3
Const InhalCol = ExposureCol + 3
Const SkinCol = InhalCol + 3
Const EyesCol = SkinCol + 3
Const IngestCol = EyesCol + 3

Const SpillDisposCol = IngestCol + 3
Const PackCol = SpillDisposCol + 1
Const EmergRespCol = PackCol + 1
Const SafeStorCol = EmergRespCol + 1

Const PhysStateCol = SafeStorCol + 1
Const RoutesCol = PhysStateCol + 1
Const ChemDangCol = RoutesCol + 1
Const InhalRiskCol = ChemDangCol + 1
Const OccupatCol = InhalRiskCol + 1
Const ShortTermCol = OccupatCol + 1
Const LongTermCol = ShortTermCol + 1
Const PhysicPropCol = LongTermCol + 1
Const EnvironCol = PhysicPropCol + 1
Const NoteCol = EnvironCol + 1
Const LastCol = NoteCol

Sub GetData()

Found = False
For Each sht In Sheets
If sht.Name = "Temp" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count))
TempSht.Name = "Temp"
Else
Set TempSht = Sheets("Temp")
TempSht.Cells.ClearContents
End If

Found = False
For Each sht In Sheets
If sht.Name = "Data" Then
Found = True
Exit For
End If
Next sht
If Found = False Then
Set DataSht = Sheets.Add(after:=Sheets(Sheets.Count))
DataSht.Name = "Data"
Else
Set DataSht = Sheets("Data")
DataSht.Cells.ClearContents
End If

Call MakeHeaders



Set ChemicalSht = Sheets("Chemicals")
With ChemicalSht
FirstRow = 1
LastRow = .Range("A" & Rows.Count).End(xlUp).Row
LastRow = 5
Set Chemicals = .Range("A" & FirstRow & ":A" & LastRow)

NewRowCount = FirstRow + 3
For Each Chemical In Chemicals
TempSht.Cells.ClearContents

With TempSht.QueryTables.Add(Connection:= _
"URL;" & Chemical.Offset(0, 1), _
Destination:=TempSht.Range("A1"))


.Name = "Temp"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

Call MoveData(Chemical, NewRowCount)
NewRowCount = NewRowCount + 1

Next Chemical
End With
DataSht.Columns("A:AQ").AutoFit
DataSht.Columns("AG").ColumnWidth = 50

DataSht.Rows("1:" & NewRowCount).VerticalAlignment = xlTop
End Sub
Sub MakeHeaders()

With Sheets("Data")

.Cells(1, ChemNameCol) = "Chemical Name"
.Cells(1, GenericNameCol) = "Generic Name(s)"
.Cells(1, CASNoCol) = "CAS No"
.Cells(1, RTECSNoCol) = "RTECS No"
.Cells(1, UNNoCol) = "UN No"
.Cells(1, ECNoCol) = "EC No"
.Cells(1, MolecularFormCol) = "Molucular Formula"
.Cells(1, AltNameCol) = "Alternate Names"
.Columns(AltNameCol).WrapText = True
.Cells(1, MoleMassCol) = "Molecular Mass"

.Range(.Cells(1, FireHazCol), .Cells(1, FireHazCol + 2)).MergeCells = True
.Cells(1, FireHazCol) = "Fire Hazard"
.Cells(1, FireHazCol).HorizontalAlignment = xlCenter
.Cells(2, FireHazCol) = "Acute Hazard/Symptoms"
.Cells(2, FireHazCol + 1) = "Prevention"
.Cells(2, FireHazCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, ExplosHazCol), .Cells(1, ExplosHazCol + 2)).MergeCells =
True
.Cells(1, ExplosHazCol) = "Explosion Hazard"
.Cells(1, ExplosHazCol).HorizontalAlignment = xlCenter
.Cells(2, ExplosHazCol) = "Acute Hazard/Symptoms"
.Cells(2, ExplosHazCol + 1) = "Prevention"
.Cells(2, ExplosHazCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, ExposureCol), .Cells(1, ExposureCol + 2)).MergeCells =
True
.Cells(1, ExposureCol) = "Exposure"
.Cells(1, ExposureCol).HorizontalAlignment = xlCenter
.Cells(2, ExposureCol) = "Acute Hazard/Symptoms"
.Cells(2, ExposureCol + 1) = "Prevention"
.Cells(2, ExposureCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, InhalCol), .Cells(1, InhalCol + 2)).MergeCells = True
.Cells(1, InhalCol) = "Inhalation Exposure"
.Cells(1, InhalCol).HorizontalAlignment = xlCenter
.Cells(2, InhalCol) = "Acute Hazard/Symptoms"
.Cells(2, InhalCol + 1) = "Prevention"
.Cells(2, InhalCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, SkinCol), .Cells(1, SkinCol + 2)).MergeCells = True
.Cells(1, SkinCol) = "Skin Exposure"
.Cells(1, SkinCol).HorizontalAlignment = xlCenter
.Cells(2, SkinCol) = "Acute Hazard/Symptoms"
.Cells(2, SkinCol + 1) = "Prevention"
.Cells(2, SkinCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, EyesCol), .Cells(1, EyesCol + 2)).MergeCells = True
.Cells(1, EyesCol) = "Eyes Exposure"
.Cells(1, EyesCol).HorizontalAlignment = xlCenter
.Cells(2, EyesCol) = "Acute Hazard/Symptoms"
.Cells(2, EyesCol + 1) = "Prevention"
.Cells(2, EyesCol + 2) = "First Aid/Fire Fighting"

.Range(.Cells(1, IngestCol), .Cells(1, IngestCol + 2)).MergeCells = True
.Cells(1, IngestCol) = "Ingestion Exposure"
.Cells(1, IngestCol).HorizontalAlignment = xlCenter
.Cells(2, IngestCol) = "Acute Hazard/Symptoms"
.Cells(2, IngestCol + 1) = "Prevention"
.Cells(2, IngestCol + 2) = "First Aid/Fire Fighting"

.Cells(1, SpillDisposCol) = "Spillage Disposal"
.Cells(1, PackCol) = "Packaging and Labelling"
.Columns(PackCol).WrapText = True
.Cells(1, EmergRespCol) = "Emergency Response"
.Cells(1, SafeStorCol) = "Safe Storage"
.Columns(SafeStorCol).WrapText = True

.Cells(1, PhysStateCol) = "Physical State; Appearance"
.Cells(1, RoutesCol) = "Routes of Exposure"
.Cells(1, ChemDangCol) = "Chemical Dangers"
.Cells(1, InhalRiskCol) = "Inhalation Risk"
.Cells(1, OccupatCol) = "Occupational exposure limits"
.Cells(1, ShortTermCol) = "Effects of short-term exposure"
.Cells(1, LongTermCol) = "Effects of long-term or repeated exposure"

.Cells(1, PhysicPropCol) = "PHYSICAL PROPERTIES"
.Cells(1, EnvironCol) = "ENVIRONMENTAL DATA"
.Cells(1, NoteCol) = "NOTES"

Range("A1:A" & LastCol).EntireColumn.AutoFit
End With

End Sub

Sub MoveData(Chemical, RowCount)


Set DataSht = Sheets("Data")
'Use ICSC: to get chemical names
With Sheets("Temp")
DataSht.Cells(RowCount, ChemNameCol) = Chemical

Set c = .Columns("B").Find(what:="ICSC:", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find ISCS for Chemical : " & Chemical)
Stop
Else
'Move Generic Name
DataSht.Cells(RowCount, GenericNameCol) = c.Offset(0, -1).Value
End If

'Use ISCS Number to find first Row of Alternate Names
FirstAlternateRow = c.Row + 2

Set c = .Columns("A").Find(what:="CAS No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find CAS Number for Chemical : " & Chemical)
Stop
Else
'Move Generic Name
DataSht.Cells(RowCount, CASNoCol) = Trim(Mid(c.Value, InStr(c.Value,
":") + 1))
End If

'Use CAS Number to find Last Row of Alternate Names
LastAlternateRow = c.Row - 1
'Get Alternate Names
Alternate = ""
For TempRowCount = FirstAlternateRow To LastAlternateRow
If Alternate = "" Then
Alternate = .Range("A" & TempRowCount)
Else
Alternate = Alternate & Chr(10) & .Range("A" & TempRowCount)
End If
Next TempRowCount

'Move Alternate Name
DataSht.Cells(RowCount, AltNameCol) = Alternate

Set c = .Columns("A").Find(what:="RTECS No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find RTECS Number for Chemical : " &
Chemical)
Stop
Else
'Move RTECS
DataSht.Cells(RowCount, RTECSNoCol) = Trim(Mid(c.Value, InStr(c.Value,
":") + 1))
End If

Set c = .Columns("A").Find(what:="UN No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find UN Number for Chemical : " & Chemical)
'Stop
Else
'Move UN No
DataSht.Cells(RowCount, UNNoCol) = Trim(Mid(c.Value, InStr(c.Value,
":") + 1))
End If

Set c = .Columns("A").Find(what:="EC No:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find EC Number for Chemical : " & Chemical)
Stop
Else
'Move EC No
DataSht.Cells(RowCount, ECNoCol) = Trim(Mid(c.Value, InStr(c.Value,
":") + 1))
End If

Set c = .Columns("C").Find(what:="Molecular mass:", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Molecular for Chemical : " & Chemical)
'Stop
Else
'Move molecular mass
DataSht.Cells(RowCount, MoleMassCol) = Trim(Mid(c.Value,
InStr(c.Value, ":") + 1))
'Move Molecular Formula
DataSht.Cells(RowCount, MolecularFormCol) = c.Offset(-1, 0).Value
End If

Set c = .Columns("A").Find(what:="FIRE", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find FIRE Hazard for Chemical : " & Chemical)
Stop
Else
'Move Fire hazard
DataSht.Cells(RowCount, FireHazCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, FireHazCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, FireHazCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="EXPLOSION", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Explosion Hazard for Chemical : " &
Chemical)
Stop
Else
'Move Explosion Hazard
DataSht.Cells(RowCount, ExplosHazCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, ExplosHazCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, ExplosHazCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="EXPOSURE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Exposure for Chemical : " & Chemical)
Stop
Else
'find 2nd occurance
Set c = .Columns("A").FindNext(after:=c)

'Move Exposure
DataSht.Cells(RowCount, ExposureCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, ExposureCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, ExposureCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Inhalation", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Inhalation Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Inhalation Exposure
DataSht.Cells(RowCount, InhalCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, InhalCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, InhalCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Skin", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Skin Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Skin Exposure
DataSht.Cells(RowCount, SkinCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, SkinCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, SkinCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Eyes", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Eyes Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Eyes Exposure
DataSht.Cells(RowCount, EyesCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, EyesCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, EyesCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="Ingestion", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Ingestion Exposure for Chemical : " &
Chemical)
Stop
Else
'Move Ingestion Exposure
DataSht.Cells(RowCount, IngestCol) = c.Offset(0, 1).Value
DataSht.Cells(RowCount, IngestCol + 1) = c.Offset(0, 2).Value
DataSht.Cells(RowCount, IngestCol + 2) = c.Offset(0, 3).Value
End If

Set c = .Columns("A").Find(what:="SPILLAGE DISPOSAL", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find SPILLAGE DISPOSAL for Chemical : " &
Chemical)
Stop
Else
'Move SPILLAGE DISPOSAL
DataSht.Cells(RowCount, SpillDisposCol) = c.Offset(1, 0).Value
End If

Set c = .Columns("B").Find(what:="PACKAGING & LABELLING",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find PACKAGING & LABELLING for Chemical : "
& Chemical)
Stop
Else
'Move PACKAGING & LABELLING
Packaging = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Packaging = "" Then
Packaging = .Range("B" & TempRowCount)
Else
Packaging = Packaging & Chr(10) & .Range("B" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, PackCol) = Packaging
End If

Set c = .Columns("A").Find(what:="EMERGENCY RESPONSE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find EMERGENCY RESPONSE for Chemical : " &
Chemical)
Stop
Else
'Move EMERGENCY RESPONSE
Emergency = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Emergency = "" Then
Emergency = .Range("A" & TempRowCount)
Else
Emergency = Emergency & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop

DataSht.Cells(RowCount, EmergRespCol) = Emergency
End If

Set c = .Columns("B").Find(what:="SAFE STORAGE", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Safe Storage for Chemical : " &
Chemical)
Stop
Else
'Move Safe Storage
DataSht.Cells(RowCount, SafeStorCol) = c.Offset(1, 0).Value
End If

Set c = .Columns("A").Find(what:="Physical State; Appearance",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Physical State; Appearance for
Chemical : " & Chemical)
Stop
Else
'Move Physical State; Appearance
Appearance = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Appearance = "" Then
Appearance = .Range("A" & TempRowCount)
Else
Appearance = Appearance & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, PhysStateCol) = Appearance
End If

Set c = .Columns("B").Find(what:="Routes of exposure", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Routes of exposure for Chemical : " &
Chemical)
'Stop
Else
'Move Routes of exposure
Routes = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Routes = "" Then
Routes = .Range("B" & TempRowCount)
Else
Routes = Routes & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, RoutesCol) = Routes
End If

Set c = .Columns("A").Find(what:="Chemical dangers", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Chemical dangers for Chemical : " &
Chemical)
Stop
Else
'Move Chemical dangers
Dangers = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Dangers = "" Then
Dangers = .Range("A" & TempRowCount)
Else
Dangers = Dangers & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, ChemDangCol) = Dangers
End If

Set c = .Columns("B").Find(what:="Inhalation risk", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Inhalation risk for Chemical : " &
Chemical)
Stop
Else
'Move Inhalation risk
Inhalation = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If Inhalation = "" Then
Inhalation = .Range("B" & TempRowCount)
Else
Inhalation = Inhalation & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, InhalRiskCol) = Inhalation
End If

Set c = .Columns("A").Find(what:="Occupational exposure limits",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Occupational exposure limits for
Chemical : " & Chemical)
Stop
Else
'Move Occupational exposure limits
Occupational = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Occupational = "" Then
Occupational = .Range("A" & TempRowCount)
Else
Occupational = Occupational & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, OccupatCol) = Occupational
End If

Set c = .Columns("B").Find(what:="Effects of short-term exposure",
LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Effects of short-term exposure for
Chemical : " & Chemical)
Stop
Else
'Move Effects of short-term exposure
ShortTerm = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If ShortTerm = "" Then
ShortTerm = .Range("B" & TempRowCount)
Else
ShortTerm = ShortTerm & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, ShortTermCol) = ShortTerm
End If

Set c = .Columns("B").Find(what:="Effects of long-term or repeated
exposure", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find Effects of long-term or repeated
exposure for Chemical : " & Chemical)
' Stop
Else
'Move Effects of long-term or repeated exposure
LongTerm = ""
TempRowCount = c.Row + 1
Do While .Range("B" & TempRowCount) <> ""
If LongTerm = "" Then
LongTerm = .Range("B" & TempRowCount)
Else
LongTerm = LongTerm & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, LongTermCol) = LongTerm
End If

Set c = .Columns("A").Find(what:="PHYSICAL PROPERTIES", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find PHYSICAL PROPERTIES for Chemical : " &
Chemical)
Stop
Else
'Move PHYSICAL PROPERTIES
Physical = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Physical = "" Then
Physical = .Range("A" & TempRowCount)
Else
Physical = Physical & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, PhysicPropCol) = Physical
End If

Set c = .Columns("B").Find(what:="ENVIRONMENTAL DATA", LookIn:=xlValues,
lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find ENVIRONMENTAL DATA for Chemical : " &
Chemical)
Stop
Else
'Move ENVIRONMENTAL DATA
DataSht.Cells(RowCount, EnvironCol) = c.Offset(1, 0).Value
End If

Set c = .Columns("A").Find(what:="NOTES", LookIn:=xlValues, lookat:=xlPart)
If c Is Nothing Then
MsgBox ("Error - Could not find NOTES for Chemical : " & Chemical)
Stop
Else
'Move NOTES
Notes = ""
TempRowCount = c.Row + 1
Do While .Range("A" & TempRowCount) <> ""
If Notes = "" Then
Notes = .Range("A" & TempRowCount)
Else
Notes = Notes & Chr(10) & .Range("A" & TempRowCount)
End If

TempRowCount = TempRowCount + 1
Loop
DataSht.Cells(RowCount, NoteCol) = Notes
End If

End With

End Sub
 
R

ron

I use hyperlinks for chemicals that access a webpage for the individual
compounds showing tables of data for that chemical.

I have a large list of chemicals and I want to automate the downloading of
specific data in the tables from the web pages.

The chemical data is stored in the same format of tables on each webpage.

Can anyone help?

Thanks,

Roger

Do you always want to obtain the same info, just for a diffrerent
chemical? If so, specifically what info do you want?..ron
 
R

ron

Ron: Look at the webpage below.  There are over 5000 chemicals each with
different properties and warnings!

http://www.ilo.org/public/english/protection/safework/cis/products/ic...






- Show quoted text -

Yes, but depending what information is required, perhaps capturing it
a) from the source code or b) by obtaining specific tables might be a
reasonable solution. For example, Table 10 contains all of the
"Important Data" information.

Set cTables = ie.Document.getElementsByTagname("table")
s = cTables(10).innertext
Debug.Print s

and "s" then contains

IMPORTANT DATA
Physical State; Appearance
COLOURLESS LIQUID, WITH CHARACTERISTIC ODOUR.
Physical dangers
The vapour is heavier than air and may travel along the ground;
distant ignition possible.
Chemical dangers
The substance can form explosive peroxides on contact with strong
oxidants such as acetic acid, nitric acid, hydrogen peroxide. Reacts
with chloroform and bromoform under basic conditions, causing fire and
explosion hazard. Attacks plastic.
Occupational exposure limits
TLV: 500 ppm as TWA, 750 ppm as STEL; A4 (not classifiable as a human
carcinogen); BEI issued; (ACGIH 2004).
MAK: 500 ppm 1200 mg/m³ Peak limitation category: I(2); Pregnancy risk
group: D; (DFG 2006).Routes of exposure
The substance can be absorbed into the body by inhalation and through
the skin.
Inhalation risk
A harmful contamination of the air can be reached rather quickly on
evaporation of this substance at 20°C ; on spraying or dispersing,
however, much faster.
Effects of short-term exposure
The vapour irritates the eyes and the respiratory tract. The substance
may cause effects on the central nervous system, liver, kidneys and
gastrointestinal tract.
Effects of long-term or repeated exposure
Repeated or prolonged contact with skin may cause dermatitis. The
substance may have effects on the blood and bone marrow.

Again, depending upon which pieces of data for each chemical are
required, perhaps one of these approaches might prove useful...ron
 
R

ron

Yes, but depending what information is required, perhaps capturing it
a) from the source code or b) by obtaining specific tables might be a
reasonable solution.  For example, Table 10 contains all of the
"Important Data" information.

    Set cTables = ie.Document.getElementsByTagname("table")
    s = cTables(10).innertext
    Debug.Print s

and "s" then contains

IMPORTANT DATA
Physical State; Appearance
COLOURLESS LIQUID, WITH CHARACTERISTIC ODOUR.
Physical dangers
The vapour is heavier than air and may travel along the ground;
distant ignition possible.
Chemical dangers
The substance can form explosive peroxides on contact with strong
oxidants such as acetic acid, nitric acid, hydrogen peroxide. Reacts
with chloroform and bromoform under basic conditions, causing fire and
explosion hazard. Attacks plastic.
Occupational exposure limits
TLV: 500 ppm as TWA, 750 ppm as STEL; A4 (not classifiable as a human
carcinogen); BEI issued; (ACGIH 2004).
MAK: 500 ppm 1200 mg/m³ Peak limitation category: I(2); Pregnancy risk
group: D; (DFG 2006).Routes of exposure
The substance can be absorbed into the body by inhalation and through
the skin.
Inhalation risk
A harmful contamination of the air can be reached rather quickly on
evaporation of this substance at 20°C ; on spraying or dispersing,
however, much faster.
Effects of short-term exposure
The vapour irritates the eyes and the respiratory tract. The substance
may cause effects on the central nervous system, liver, kidneys and
gastrointestinal tract.
Effects of long-term or repeated exposure
Repeated or prolonged contact with skin may cause dermatitis. The
substance may have effects on the blood and bone marrow.

Again, depending upon which pieces of data for each chemical are
required, perhaps one of these approaches might prove useful...ron- Hide quoted text -

- Show quoted text -

Whoops, I checked a few more chemicals and found that Table 10 is not
always the "Important Data" table. Still, I'd like to see an answer
to my original question - what specific data fields does Roger want to
capture?..ron
 
J

Joel

Ron: It looks a little easier to get the data from the webpage your way than
mine. Your has an advantage of getting the formated data from the innerhtml
property.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top