SharePoint 2010 and Adobe PDF

SharePoint 2010 does not crawl PDFs out of the box. Here is how to get it to do so.

  1. Download and install Adobe’s 64-bit PDF iFilter: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025

  2. Download the Adobe PDF icon (select Small 17 x 17)  http://www.adobe.com/misc/linking.html

  3. Give the icon a name or accept the default: ‘pdficon_small.gif’

  4. Save (or copy) the icon to C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\IMAGES

  5. Edit the DOCICON.XML file to include the PDF icon (steps 6 to 10 describe how)

  6. In Windows Explorer, navigate to C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML

  7. Open the DOCICON.XML file for editing (I use Notepad, you can also use the built-in XML Editor)

  8. Ignore the section <ByProgID> and scroll down to the <ByExtension> section of the file

  9. Within the <ByExtension> section, insert a <Mapping Key="pdf" Value="pdficon_small.gif" /> element. Type the quotes as straight quotes, because curly quotes make the XML invalid. The easiest way is to copy an existing element: I usually just copy the line that starts <Mapping Key="png"… and replace the values of Key and Value

  10. Save and close the file

  11. Add PDF to the list of supported file types within SharePoint (steps 12 to 17 describe how)

  12. In the web browser, open SharePoint Central Administration

  13. Under Application Management, click on Manage service applications

  14. Scroll down the list of service apps and click on Search Service Application

  15. Within the Search Administration dashboard, in the sidebar on the left, click File Types

  16. Click ‘New File Type’ and enter PDF in the File extension box. Click OK

  17. Scroll down the list of file types and check that PDF is now listed and displaying the pdf icon.

  18. Close the web browser

  19. Run IISRESET twice

  20. Perform a full crawl of your index. Note: An incremental crawl is not sufficient.
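
Steps 5 to 10 can also be done from PowerShell instead of Notepad. The following is only a minimal sketch: it assumes DOCICON.XML has a <DocIcons> root with a <ByExtension> child and no XML namespace, Add-PdfIconMapping is a hypothetical helper name, and the in-memory sample stands in for the real file. Back up DOCICON.XML before trying anything like this on a server.

```powershell
# Hypothetical helper: adds a pdf <Mapping> element to a DOCICON.XML document
# if one is not already there. Returns $true when the document was changed.
function Add-PdfIconMapping {
    param(
        [xml]$DocIcon,
        [string]$IconFile = "pdficon_small.gif"
    )
    $byExtension = $DocIcon.SelectSingleNode("/DocIcons/ByExtension")
    if ($byExtension.SelectSingleNode("Mapping[@Key='pdf']")) {
        return $false   # already mapped, nothing to do
    }
    $mapping = $DocIcon.CreateElement("Mapping")
    $mapping.SetAttribute("Key", "pdf")
    $mapping.SetAttribute("Value", $IconFile)
    [void]$byExtension.AppendChild($mapping)
    return $true
}

# In-memory sample standing in for the real DOCICON.XML
[xml]$sample = '<DocIcons><ByProgID /><ByExtension><Mapping Key="png" Value="icpng.gif" /></ByExtension></DocIcons>'
$added = Add-PdfIconMapping -DocIcon $sample

# Against the real file (path taken from step 6) it might look like this:
# $path = "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML\DOCICON.XML"
# [xml]$doc = Get-Content $path
# if (Add-PdfIconMapping -DocIcon $doc) { $doc.Save($path) }
```

Because the helper checks for an existing pdf mapping first, running it a second time leaves the file unchanged.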

The Quick and Easy PS Way

Thanks to Johan Skoglund, what a legend, what a champion.

# Configure-pdf-search.ps1
# johan@dgk.se, 2010-12-28
# http://itbloggen.se/cs/blogs/josko/
#
# This script adds the functionality to crawl and index content of pdf files in SharePoint 2010
# The script will:
# - download and install iFilter 9 x64 from Adobe
# - download and install the pdf icon from Adobe (including adding the pdf icon file to the DOCICON.XML file)
# - add the pdf extension to the list of documents to be indexed to the Sharepoint 2010 Search Application
# - register pdf filter for SharePoint Search in the registry
# - restart the SP search service and IIS
#
# The script should be run on all indexing servers in the sharepoint farm to install the iFilter
# The account that runs this script needs to be SP Farm administrator and to have administrator privileges on all servers in the farm
# The script is tested and works fine on Windows Server 2008 R2 64-bit
#
# The Configure PDF iFilter for SharePoint 2010 process is documented in many places including:
# http://www.sharepointsharon.com/2010/03/sharepoint-2010-and-adobe-pdf/
# http://www.codeproject.com/KB/sharepoint/PDFiFIlterSharePoint2010.aspx
# http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025
# http://www.adobe.com/special/acrobat/configuring_pdf_ifilter_for_ms_sharepoint_2007.pdf
# configuration
$tempfolder = "C:\temp"

# create the temp folder if it doesnt exist
Get-Item $tempfolder -ErrorVariable err -ErrorAction "SilentlyContinue" | Out-Null
if ([String]::IsNullOrEmpty($err) -eq $false){
 new-item -type directory -path $tempfolder -ErrorVariable err -ErrorAction "SilentlyContinue" | Out-Null
 $err = ""
}
function RestartIIS(){
 $title = "Restart IIS"
 $message = "Do you want to restart the local IIS?"
 $yes = New-Object System.Management.Automation.Host.ChoiceDescription "&Yes","Restarts IIS."
 $no = New-Object System.Management.Automation.Host.ChoiceDescription "&No","Silently continues..."
 $options = [System.Management.Automation.Host.ChoiceDescription[]]($yes, $no)
 $result = $host.ui.PromptForChoice($title, $message, $options, 0)
 if ($result -eq 0){
 iisreset /noforce
 }
}

function download{
 # usage: download http://url c:\temp
 param([string]$URL, [string]$destination)
 Write-Output ""
 Write-Output "Downloading $URL ..."
 $clnt = New-Object System.Net.WebClient
 try {
 # DownloadFile throws on failure, so catch the exception instead of testing $err
 $clnt.DownloadFile($URL, $destination)
 Write-Output " - Download completed."
 }
 catch { Write-Error "Download ERROR - Check URL: $_" }
}

function Extract-Zip {
 # usage: extract-zip c:\demo\myzip.zip c:\demo\destination
 # originally from http://blogs.msdn.com/b/daiken/archive/2007/02/12/compress-files-with-windows-powershell-then-package-a-windows-vista-sidebar-gadget.aspx
 param([string]$ZIPname, [string]$destination)
 $ZIPfile = Get-Item $ZIPname -ErrorVariable err -ErrorAction "SilentlyContinue" # gets the file as an object
 if ([String]::IsNullOrEmpty($err) -eq $false) {
 Write-Error "ERROR: $err Cannot find $ZIPname !!!"
 EXIT
 }
 $ZIPfolder = Get-Item $destination -ErrorVariable err -ErrorAction "SilentlyContinue" # gets the folder as an object
 if ([String]::IsNullOrEmpty($err) -eq $false) {
 Write-Error "ERROR: $err Cannot find $destination !!!" # $ZIPfolder is empty when the lookup fails
 EXIT
 }
 ELSE{
 $zipname = $zipfile.fullname # makes sure the path is absolute
 $zipDestination = $ZIPfolder.fullname # makes sure the destination path is absolute
 $shellApplication = new-object -com shell.application
 $zipPackage = $shellApplication.NameSpace($zipname)
 $destinationFolder = $shellApplication.NameSpace($ZIPdestination)
 $destinationFolder.CopyHere($zipPackage.Items())
 }
}

function AddSystemPaths([array] $PathsToAdd) {
 # originally from http://blogs.technet.com/b/sqlthoughts/archive/2008/12/12/powershell-function-to-add-system-path.aspx
 $VerifiedPathsToAdd = ""
 foreach ($Path in $PathsToAdd) {
 if ($Env:Path -like "*$Path*") {
 echo " $Path already in the path"
 }
 else {
 $VerifiedPathsToAdd += ";$Path";
 }
 }
 if ($VerifiedPathsToAdd -ne "") {
 echo "Adding $VerifiedPathsToAdd to system path"
 [System.Environment]::SetEnvironmentVariable("PATH", $Env:Path + "$VerifiedPathsToAdd","Machine")
 }
}

function ConfigurePDFSearch {
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction "SilentlyContinue" | Out-Null
 $farm = get-spfarm
 Download "http://www.adobe.com/images/pdficon_small.gif" "$tempfolder\pdf.gif"
 foreach($Server in $farm.servers){ # connecting to all application servers in the farm
 if (($Server.Role -eq "Application") -and ($Server.Status -eq "Online")){
 Write-Output ""
 Write-Output ("Copies the PDF icon to the sharepoint folder on " + $Server.Name + "...")
 $DestFile = "\\" + $Server.name + "\c$\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\IMAGES\pdf.gif"
 copy-item "$tempfolder\pdf.gif" -destination $DestFile -ErrorVariable err -ErrorAction "SilentlyContinue"
 if ([String]::IsNullOrEmpty($err) -eq $true) { Write-Output " - Copy operation completed."}
 else { Write-Error "Copy ERROR: $err" }
 }
 }
 Write-Output ""
 Write-Output "Adds PDF to the list of search extensions in the Search Application..."
 $searchApp = Get-SPEnterpriseSearchServiceApplication -ErrorVariable err -ErrorAction "SilentlyContinue"
 if ([String]::IsNullOrEmpty($err) -ne $true) { Write-Error "Error: Search Application is missing / not created yet : $err" }
 # Do not pipe this lookup to Out-Null, otherwise $PDFcheck is always empty
 $PDFcheck = Get-SPEnterpriseSearchCrawlExtension "pdf" -SearchApplication $searchApp -ErrorVariable err -ErrorAction "SilentlyContinue"
 if ($null -eq $PDFcheck){
 new-SPEnterpriseSearchCrawlExtension "pdf" -SearchApplication $searchApp -ErrorVariable err -ErrorAction "SilentlyContinue" | Out-Null
 if ([String]::IsNullOrEmpty($err) -eq $true) { Write-Output " - Add completed."}
 else { Write-Error "Error: $err" }
 }
 Else{
 Write-Output " The PDF extension was already in the list"
 }
 foreach($Server in $farm.servers){ # same selection as above: online servers with the Application role
 if (($Server.Role -eq "Application") -and ($Server.Status -eq "Online")){
 Write-Output ""
 Write-Output ("Adding pdfs as extension to docicons xml file on " + $Server.name)
 $XMLfile = "\\" + $Server.name + "\c$\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML\DOCICON.XML"
 [xml]$dociconxml = get-content $XMLfile -ErrorVariable err -ErrorAction "SilentlyContinue"
 if ([String]::IsNullOrEmpty($err) -eq $true) {
 $PNGelement = $dociconxml.DocIcons.ByExtension.Mapping | Where-Object { $_.Key -eq "png" }
 $PDFnode = $dociconxml.DocIcons.ByExtension.Mapping | Where-Object { $_.Key -eq "pdf" }
 if ($PDFnode.key -eq "pdf"){
 write-output " - XML document was already updated."
 }
 Else{ # add a new pdf node to the xml document
 $element = $dociconxml.DocIcons.ByExtension.Mapping[0].clone() # Duplicates an existing node
 $element.key = "pdf"
 $element.value = "pdf.gif"
 $element.OpenControl = ""
 $element.EditText = ""
 $dociconxml.DocIcons.ByExtension.InsertBefore($element,$PNGelement) | Out-Null # Inserts the new node before the existing PNG element
 $dociconxml.save($XMLfile)
 if ([String]::IsNullOrEmpty($err) -eq $true) { Write-Output " - XML updated."}
 else { Write-Error "Update ERROR: $err" }
 }
 }
 else { Write-Error "XML wasn't found: $err" }
 }
 }
 Download "http://download.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip" "$tempfolder\PDFiFilter64installer.zip"
 write-output " - Unzipping file..."
 extract-zip "$tempfolder\PDFiFilter64installer.zip" $tempfolder
 write-output ""
 write-output "Running the PDF iFilter installer..."
 $proc = Start-Process C:\Windows\System32\msiexec.exe " /passive /i $tempfolder\PDFFilter64installer.msi" -Wait -PassThru -ErrorVariable err -ErrorAction "SilentlyContinue"
 # Start-Process does not set $LASTEXITCODE; -PassThru returns the process object that carries the exit code
 if ($proc.ExitCode -eq 0){
 Write-Output " - OK" }
 else{
 Write-Output (" - Probably OK (Installation returned error code: " + $proc.ExitCode + ")") }

 write-output ""
 write-output "Adding the pdf dll path to system path..."
 AddSystemPaths("C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\")

 write-output ""
 write-output "Adding pdf entries for Sharepoint Search in the registry..."

 New-Item -path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\Filters\.pdf' | Out-Null
 New-ItemProperty -Path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\Filters\.pdf' -Name "Extension" -value ".pdf" -PropertyType string | Out-Null
 New-ItemProperty -Path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\Filters\.pdf' -Name "Mime Types" -value "application/pdf" -PropertyType string | Out-Null
 New-ItemProperty -Path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\Filters\.pdf' -Name "FileTypeBucket" -value "1" -PropertyType dword | Out-Null

New-Item -Path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf' | Out-Null
 New-ItemProperty -Path registry::'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf' -name "(Default)" -Value "{E8978DA6-047F-4E3D-9C78-CDBE46041603}" -PropertyType string | Out-Null

write-output "Re-register the adobe iFilter dll..."
 regsvr32.exe "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\PDFFilter.dll"

# Finally, issue an IISReset and restart the sharepoint search service
 RestartIIS
 Write-Output "Restarting the Search Service..."
 Stop-Service "OSearch14"
 Start-Service "OSearch14"

 $exitprompt = Read-Host "Configuration complete. Press ENTER to exit"
}

ConfigurePDFSearch

write-output "Done!"
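
The script stops after restarting the services, so the full crawl from step 20 still has to be started separately. The following is a rough sketch of one way to do that from PowerShell. Start-FullCrawlForSources is a hypothetical helper name; it assumes each object passed in exposes Name, CrawlStatus and StartFullCrawl(), as the content source objects returned by Get-SPEnterpriseSearchCrawlContentSource do, and it assumes the farm has a single Search Service Application.

```powershell
# Hypothetical helper: starts a full crawl on every content source that is idle
# and reports the ones that are skipped because a crawl is already running.
function Start-FullCrawlForSources {
    param([object[]]$ContentSources)
    foreach ($source in $ContentSources) {
        if ($source.CrawlStatus -eq "Idle") {
            Write-Output ("Starting full crawl of " + $source.Name)
            $source.StartFullCrawl()
        }
        else {
            Write-Output ("Skipping " + $source.Name + " (status: " + $source.CrawlStatus + ")")
        }
    }
}

# On a farm server, in the SharePoint 2010 Management Shell:
# $searchApp = Get-SPEnterpriseSearchServiceApplication
# Start-FullCrawlForSources (Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $searchApp)
```

Content sources that are already crawling are left alone on purpose, so the helper does not interrupt a crawl in progress.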
