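November 28, 2006

Google settles with Belgian journalists' and photographers' groups

http://japan.cnet.com/news/biz/story/0,2000056020,20330127,00.htm

By Elinor Mills (CNET News.com)
2006/11/28 10:23

According to Bloomberg News, Google has settled with Belgian journalists and photographers who had been fighting it in court over copyright in links on "Google News."
The Bloomberg article states: "A settlement has been reached in the lawsuit that sought to stop Google from linking to Belgian newspaper articles without payment, and two of the five plaintiff groups have withdrawn their claims. Google spokesperson Jessica Powell declined to comment on the details of the settlement concluded with Sofam, a copyright management group representing 3,700 photographers, and Scam, a journalists' organization."
Groups representing French- and German-language newspapers published in Belgium sued Google in February 2006. In September, a court ordered Google to remove links to these newspapers from Google News and set fines for non-compliance. Google complied with the ruling. The case is now being reheard, however, and at a hearing late last week the judge said a decision is expected in early January 2007.
Agence France-Presse has also brought a similar copyright suit against Google.
Microsoft, meanwhile, agreed to remove links to Belgian newspapers before being sued.
posted by gljblog at 00:00 | Other Google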

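French production company sues Google for copyright infringement

http://japan.internet.com/wmnews/20061128/12.html

Can Google (NASDAQ:GOOG), YouTube, and other video-sharing sites avoid lawsuits from angry copyright holders while preserving their free community spirit? A new lawsuit in Europe has sharpened that question.
Flach Film, a documentary producer, filed suit in the Paris commercial court on the 23rd, alleging that the French version of "Google Video" violated French copyright law by allowing its film "The World According to Bush" to be downloaded free of charge more than 43,000 times. "Flach Film is asking the court to order Google to compensate for the losses resulting from this illegal conduct," the company said in a statement.
Google, for its part, says it removed the video promptly after being notified. "Uploading videos without the appropriate rights violates Google Video's terms of use," a company spokesperson said.
This suit is not the only one. Legal clouds are gathering over the much-anticipated video-sharing market. Google announced in October that it would acquire YouTube for $1.65 billion, and lawsuits have been filed one after another before the ink on the deal was dry. In a filing submitted to the US Securities and Exchange Commission (SEC) earlier this month, Google also reported that it faces copyright infringement suits. Shortly afterwards, a new lawsuit over a French documentary posted to YouTube came to light; at the time, a company spokesperson dismissed it as "a small lawsuit over a single video that was on the site for a short time."
posted by gljblog at 00:00 | Other Google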

November 22, 2006

European Commission presents member states with a timetable for the European Digital Library

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2582

Wednesday, November 22, 2006

On November 13, the European Commission's Education, Youth and Culture Council adopted a decision calling on EU member states to pool their efforts toward the European Digital Library. It also presented a timetable of the work to be given priority on the way to the European Digital Library.

Press Release
2762nd Council Meeting
Education, Youth and Culture
Brussels, 13-14 November 2006
http://europa.eu.int/information_society/activities/digital_libraries/doc/culture_council/council_13_11_2006.pdf
(Note: the relevant section is on pages 10 to 16.)
posted by gljblog at 00:00 | Other collection digitization projects

European Digital Library copyright subgroup releases interim report

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2581

Wednesday, November 22, 2006

The Copyright Subgroup under the European Digital Library's High Level Expert Group has been analyzing and examining copyright issues related to building the European Digital Library since June 2006, and has now released its interim report. The report makes proposals to EU member states on high-level principles, digital preservation, orphan works, out-of-print works, and related matters.

European Digital Library Initiative
High Level Expert Group (HLG) - Copyright Subgroup Interim Report (16.10.06)
http://europa.eu.int/information_society/activities/digital_libraries/doc/minutes_of_hleg_meet/copyright_subgroup/interim_report_16_10_06.pdf
High Level Expert Group
http://europa.eu.int/information_society/activities/digital_libraries/cultural/actions_on/consultations/hleg/index_en.htm
posted by gljblog at 00:00 | Other collection digitization projects

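New features such as zoom added to Google Book Search

http://www.itmedia.co.jp/news/articles/0611/22/news081.html

A one-page mode, which stacks a book's pages vertically so that you turn pages by scrolling, has also been added.
Updated November 22, 2006, 18:02

Google announced on November 21 that it has added new features to its book search service, "Google Book Search."

The additions are a zoom function, which enlarges or shrinks text and images when you click the magnifying-glass icons, and a one-page mode, which stacks the book's pages vertically on a single web page so that you turn pages by scrolling. Some books also offer a two-page mode that displays facing pages, as when a printed book lies open.

Clicking the "Full screen" icon at the upper right of the screen displays the book's pages across the full browser window. Clicking the "About this book" link under "Summary" displays detailed information about the book, including books related to the one the user is interested in and other books and scholarly papers that refer to it.
posted by gljblog at 00:00 | Google Book Search project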

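New "Google Book Search" UI with scroll-to-turn pages released

http://internet.watch.impress.co.jp/cda/news/2006/11/22/14022.html

It became known by the 21st that Google has substantially improved the user interface of its book search service, "Google Book Search."

In the new user interface, pages load one after another as you scroll vertically through the book's page view. Not having to click an arrow for the next page, as before, is very convenient; it feels like viewing a PDF on screen. The book view can also be expanded to full screen, and a zoom function enlarges or shrinks the text. Combining features, for example enlarging the text in full-screen mode when reading a book with small print, makes the service considerably more usable.

Books added through the Google Library Project can be read as a two-page spread. In this mode, clicking an icon at the top of the screen returns to the scrolling view. Reading in spread mode feels like reading an ordinary book. Pages can be turned not only by clicking the right arrow but also by pressing the space bar or the Page Up and Page Down keys. New pages load automatically, so the reader can move through the book just by pressing keys, which makes for a highly readable interface.

Another important new service is the "About This Book" page, which provides detailed information about a book. It is displayed by clicking the "About This Book" or "More about this book" link at the top of the right-hand column of the book view.

The page offers not only a summary description of the book but also a list of related books, links to books cited in the book, and a list of keywords used frequently in the book, and it lets you sample several selected pages to get a sense of the book. Some of this information reportedly comes from third parties. The page gathers the information a reader needs about a book in one place. From here the book can be bought from online bookstores such as Amazon.com, and libraries that hold it can be searched. Because Google generates the page automatically by algorithm, the information shown on it cannot currently be changed on request, the company says.

2006/11/22
posted by gljblog at 00:00 | Google Book Search project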

Page-turning and other new features in Google Book Search

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2584

Google Book Search has added zoom in/out, page-turning, and related-book and cited-reference display functions.
posted by gljblog at 00:00 | Google Book Search project

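November 21, 2006

International conference on digital libraries to be held at Kyoto University (ICADL2006)

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2573

Tuesday, November 21, 2006

ICADL2006 (International Conference on Asian Digital Libraries) will be held at Kyoto University from November 27 to 30. In addition to a talk by a director of Google Book Search, special sessions are planned on the digital library services of the National Library of Singapore, National Taiwan University, the University of Indonesia, and others.

ICADL2006 (International Conference on Asian Digital Libraries)
http://www.icadl2006.org/index-jp.html

posted by gljblog at 00:00 | Google Book Search project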

A new way to browse books - Official Google Blog

http://googleblog.blogspot.com/2006/11/new-way-to-browse-books.html

11/21/2006 06:51:00 PM

Posted by Nathan Naze, Software Engineer

As a kid, I was a bit of a fixture at my hometown library. My mom and I would visit frequently and the librarians knew me by name. It's only fitting that now, decades later, I work as an engineer for Google Book Search, Google's project to make the world's books searchable, just like the web.

My latest assignment has been to help develop a better way to browse our digitized books on a computer screen. I've always had an interest in cutting-edge web applications. Existing Google products such as Gmail, Google Maps, and Google Docs & Spreadsheets make heavy use of JavaScript and DHTML to create full-featured applications in a web browser that you can use without having to download and install anything.

In an effort to make online book reading easier, we've given our product the same treatment. I'm tremendously excited to announce the first fruits of these efforts. Here's a quick tour of some of the changes:

Zoom in on text and images. Here's a cool full-page sketch of a ship from an 1898 book on steam navigation. Looking for something less dated? Perhaps this colorful page of a room from a book on interior design. Want a better look? You can now zoom in and out: just click on the zoom-in and zoom-out buttons. Play with it until you find a size you like.

One book, one web page. No more reloads! In one-page mode (just click the one-page button), pages appear one below the other, like a scroll of paper. For full-view books, there's also a two-page mode in which pages appear side by side, just like in a physical book (perfect for two-page images). In both modes, you'll be able to use the arrow buttons to turn pages.

Scroll, scroll, scroll your book… using the scrollbar or your mouse wheel, or by dragging (in most browsers, the cursor changes to show that you can drag). You can also use the keyboard (try the spacebar, page up, page down, and the arrow keys). Or you can click on a link in the table of contents or your search results to jump right to that page (like this photo from the 1906 book Geronimo's Story of His Life).

This page was made for reading. We've tried to tidy up the clutter to leave as much room as possible for what's important: the book. We've put all the information about the book in a scrollable side menu. Still not enough room? You can put the screen in fullscreen mode with the fullscreen button, so you can use the whole window for browsing. Try it with a nice illustrated book of Celtic fairy tales or, for some lighter reading, electromagnetic wave theory.

More on this (and other) books. Find other books that interest you. Just click on "About this book" to find more books related to the book you're reading. If the book How to Draw Comic Book Heroes and Villains interests you, you'll probably like Comic Book Artist Collection, Vol. 1. We also revised our "About this book" page to provide better information for in-copyright books, from which you can just see short snippets or a limited preview.

Explore citations and references. You can also find other books that refer to your book of interest. If scholarly works from Google Scholar have references to the book, you'll see them too. As an example, see what other works have referred to Aristotle's works or the 1922 book All About Coffee.
So check out the new Google Book Search. We hope it'll help you find new (and old) books that interest you. Try it out, and let us know what you think.

posted by gljblog at 00:00 | Google Book Search project

November 16, 2006

Putting old maps to use: old-map layer added to Google Earth

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2555

Thursday, November 16, 2006

Old-map data has been added to Google's "Google Earth." The data was provided by private map collector David Rumsey and includes old maps of Tokyo in 1680 and Asia in 1710.

http://www.google.com/press/pressrel/earth_awareness.html
http://googleblog.blogspot.com/2006/11/old-world-meets-new-on-google-earth.html

David Rumsey Map Collection
http://www.davidrumsey.com/

See also:
View old maps from around the world in Google Earth (CNET)
http://www.dap.ndl.go.jp/ca/modules/car/wp-admin/post.php
"Google Earth" adds a map collector's old-map collection and more (INTERNET WATCH)
http://internet.watch.impress.co.jp/cda/news/2006/11/14/13926.html
The evolving world of maps (CA1607)

posted by gljblog at 00:00 | Other Google

University of Virginia Library also joins the Google Books Library Project

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2554

Thursday, November 16, 2006

It has been announced that the University of Virginia will join the Google Books Library Project as its ninth participating institution.

The University of Virginia Library Joins the Google Books Library Project
http://www.google.com/press/annc/books_uva.html

U.Va. Library Joins the Google Books Library Project
http://www.virginia.edu/uvatoday/newsRelease.php?id=1053
posted by gljblog at 00:00 | Google Book Search project

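November 15, 2006

University of Virginia joins Google's Books Library

http://www.itmedia.co.jp/news/articles/0611/15/news019.html

The University of Virginia Library, founded by Thomas Jefferson, third US president and one of the founding fathers, will also take part in Google's book search project.
Updated November 15, 2006, 07:57

Google announced on November 14 that the University of Virginia will join its book search project, the "Google Books Library Project." The university's library was founded by President Thomas Jefferson and is known for its wealth of books and materials from the founding era of the United States.

The library has 13 locations as well as the original Rotunda, and holds more than five million volumes, 17 million manuscripts, rare books, and digital documents. Google will digitize part of the library's holdings in history, culture, and the humanities.

Public-domain books from the university will be freely searchable and viewable by anyone. For books under copyright, the "Book Search" function shows only basic information (such as title and author), along with information on where the book can be obtained and which libraries lend it.

Partners already taking part in Google's book search project and digitizing their collections include the Library of Congress, Harvard University, the New York Public Library, the University of California, the University of Michigan, Oxford University, Stanford University, and the University of Wisconsin-Madison.
posted by gljblog at 00:00 | Google Book Search project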

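November 14, 2006

Google adds old maps to Google Earth to mark Geography Awareness Week

http://itpro.nikkeibp.co.jp/article/USNEWS/20061114/253586/

To mark Geography Awareness Week, Google has added old maps as new "Featured Content" in "Google Earth," its map display software built on satellite imagery and three-dimensional (3D) graphics. The company made the announcement on November 13, US time.
The new content is a digitized version of the old-map collection privately owned by map collector David Rumsey. Sixteen old maps, including Tokyo in 1680, North America in 1733, and Buenos Aires in 1892, can be viewed in Google Earth. According to Google, the content shows how these regions developed and how people's understanding of geography changed.
Geography Awareness Week, which began on November 13, is promoted by the National Geographic Society as a campaign to deepen understanding of geography. This year's week focuses on Africa. In cooperation with the Society, Google is also offering an interactive quiz that tests knowledge of African geography.
Rumsey commented: "It is wonderful that Google Earth's latest technology presents the geographical history of the Earth in an innovative way. The cartographers who made these maps hundreds of years ago would be both surprised and pleased to see their work featured in Google Earth."

(ITpro) [2006/11/14]
posted by gljblog at 00:00 | Other Google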

U.Va. Library Joins the Google Books Library Project

http://www.google.com/intl/en/press/annc/books_uva.html


Nov. 14, 2006 -- Today, Google welcomes its newest partner - the University of Virginia Library - to the Google Books Library Project. Built by Thomas Jefferson, one of the founding fathers of the United States, the U.Va. Library carries a wealth of early American historical material among its rich collections.

Google will digitize hundreds of thousands of books from the Library, including selected portions of the Library's American history, literature, and humanities works collections, and make them searchable online through Google Book Search. With 13 physical locations as well as the original Rotunda, the Library contains more than five million volumes, 17 million manuscripts, rare books and archives, and rapidly-growing digital collections.

For scholars and readers all over the world, this offers even more access to the great works of history and culture. By simply searching online, researchers across the globe can discover books held on the shelves of the U.Va. Library, including a broad range of materials from American literature to Buddhist studies.

"This is an historic moment," said University President John T. Casteen III. "When Jefferson designed the University, he placed the library at its center -- both physically and academically. Reading and the quest for knowledge were all-important to him. Reaching out into the world -- what we now call Globalization -- was central to his vision of what an American university must do to promote the knowledge that sustains personal freedom. To have the library that is the clearest single emblem of this vision now assume a role in a vast, international digital library has special meaning here. It puts a distinctly contemporary meaning to our founder's dream of making knowledge accessible to all people."

Anyone will be able to freely view, browse and read U.Va.'s books in the public domain. For books protected by copyright, scholars searching on Book Search will be able to see the basic background of relevant books (such as the title and the author's name), and at most a few lines of text related to their search. They can also find information about where they can buy or borrow a book.

The University of Virginia becomes the latest partner in the Google Books Library Project, which also includes the University of California, Harvard University, University Complutense of Madrid, University of Michigan, the New York Public Library, Oxford University, Stanford University and the University of Wisconsin-Madison. Google is also conducting a pilot project with the Library of Congress.

The Google Books Library Project digitizes books from major libraries around the world and makes their collections searchable on Google Book Search. More information can be found at: http://books.google.com .

Also see today's blog post on the Google Book Search about the announcement: http://booksearch.blogspot.com/.

posted by gljblog at 00:00 | Google Book Search project

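November 13, 2006

Google brushes off copyright suit over video posted to YouTube

http://japan.internet.com/wmnews/20061113/12.html

A month has passed since Google (NASDAQ:GOOG) announced it would acquire YouTube, the largest video-sharing site, and the number of lawsuits involving video-sharing sites is growing. The most recent is a suit seeking $192,465 in damages over a French-produced documentary posted to YouTube. According to a Google spokesperson, however, the company dismissed the claim as "a trivial lawsuit over a single video that was posted temporarily." Before the case came to light, the company had filed its July-September quarterly report with the US Securities and Exchange Commission (SEC) on the 8th; in the section on legal matters, it stated that the YouTube acquisition could expose it to further copyright-related lawsuits.
Google spokesperson Ricardo Reyes said of the suit that because YouTube has put in place various measures committing it to respond quickly to copyright infringement complaints, it "falls under the safe harbor provisions, like Google and many Web hosting companies." According to Reyes, Google and YouTube "have a firm policy of promptly removing material upon request." As the quarterly report shows, Google anticipated that the YouTube acquisition would bring legal problems of this kind. Whether in preparation for the acquisition or simply to win advertisers' trust, YouTube signed licensing agreements for commercial content with several major media companies on the day of the acquisition announcement and shortly before it.
At the "Web 2.0 Summit," which closed on the 9th, Google CEO Eric Schmidt denied reports that the company is setting aside a secret fund for possible future lawsuits. Instead, Google says it is negotiating energetically with content companies so that it and YouTube do not face legal action.

posted by gljblog at 00:00 | Other Google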

November 7, 2006

Peter's Digital Reference Shelf : Google Book Search

Peter's Digital Reference Shelf
November 2006

Title: Google Book Search (a.k.a. GooglePrint)
Publisher: Google, Inc.
URL: http://books.google.com
Cost: Free
Tested: Continuously

The Context
Google Book Search (GBS), launched in 2004 under the name Google Print, is the most controversial of the many beta releases of Google, Inc., mostly because of its Google Books Library Project module. I skip the legal and ethical pros and cons of the case; there are many substantial sources to let you see both sides of the coin, and legal cases are pending. Here is an excellent bibliography by Charles Bailey. I focus on what the current content is, what is accessible, and how the software helps or hinders finding materials. Only a small segment of the books and other print materials seems to be available free in its entirety. For this column, I approach it primarily from the ready reference perspective, where even snippets of information can be useful.

Google has an unusually extensive background page about Google Book Search (but without any factual information about the size and composition of the database). It is full of success stories and happy testimonials. They are mostly from users who believe that the concept of digitizing books and making them full-text searchable is yet another innovation by Google, Inc. These happy users apparently have lived in the Google bubble, ignorant about other alternatives.

The eBook idea first appeared in the early 1970s, when Michael Hart started the Gutenberg Project to scan pages and convert them into plain-text format public domain documents. By now there are 19,700 eBooks in Project Gutenberg. By today’s standard this is a relatively small amount, but these items can be displayed and/or printed in their entirety (although the typography is plain and ugly ASCII text, not a facsimile of the books). It is dwarfed by the beautiful American Memory multimedia super collection of historical materials. Its creation started in the early 1990s (15 years before Google Print), and now has more than 9 million items. It has 465 items about the impeachment of Andrew Johnson alone.

The Million Books project is another mega database that started long before Google Print was conceived.

There are several, relatively small but worthy eBook collections that are free to search and display the full text of books, such as the small scholarly book collection of the National Academies Press or the free subset of ebrary with about 30,000 books. For further information, see Nicholas Tomaiuolo’s well-updated and annotated list of e-text collections and the Open Directory Project section on the topic as implemented by Google.

The Look Inside The Book (LIB) and later Search Inside The Book (SIB) features of Amazon, one of the most prominent pioneers of the Web era, must have been the obvious inspiration for Google Book Search (GBS). The SIB subset of Amazon has about 280,000 fully searchable books. Many of these are greatly enriched by extra information, such as book reviews from professional journals, information about the authors, and citing and cited references, as I discussed in my review.

The Software
I almost always discuss the software at the end of the review, but here I must make an exception and bring up serious software problems that confuse even veteran searchers, and distort or make enigmatic some results. Even with simple searches, there is enough confusion because of the ignorance, illiteracy and innumeracy of the software.

Boolean search
The most startling problem is the incorrect handling of the Boolean OR operation, the simplest of all. It is taught in kindergarten that a search for A OR B cannot produce fewer results than the larger of the result counts for A and for B alone. Still, the query aboulia produces 26 items, abulia yields 40, but aboulia OR abulia produces only 35.

Nor can a search for A OR B produce more hits than the sum of the hits found for A and for B. But this is what happens, as illustrated by a simple search for books with the word arrogance in the title. It finds 2 books. The search for books with the word arrogant in the title finds 6 documents. (Minutes earlier the software produced 8 hits, and such disappearances add another dimension to the confusion.) The search for books with arrogant OR arrogance in the title yields 13 books.

This is surprising, as there could not be more than 8 books. The first page of the list shows books with the word arrogance in the title that were not shown when searching for that word. The same is true for arrogant. This may explain the result of the OR operation but then keeps the user wondering why those extra books were retrieved only for the Boolean OR operation.
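The two bounds described above can be checked mechanically. What follows is only a minimal sketch, not something GBS offers: the function name or_count_is_plausible is hypothetical, and the counts passed to it are the ones reported in this review.

```python
def or_count_is_plausible(count_a: int, count_b: int, count_a_or_b: int) -> bool:
    """Check a reported hit count for "A OR B" against the counts for A and B.

    A union can never be smaller than its larger part,
    and never larger than the sum of both parts.
    """
    lower_bound = max(count_a, count_b)
    upper_bound = count_a + count_b
    return lower_bound <= count_a_or_b <= upper_bound


# Counts reported in the review (hypothetical helper, real figures).
print(or_count_is_plausible(26, 40, 35))  # aboulia, abulia, aboulia OR abulia
print(or_count_is_plausible(2, 6, 13))    # arrogance, arrogant, arrogant OR arrogance
```

Both calls report an implausible count: 35 is below the lower bound of 40, and 13 is above the upper bound of 8.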

Using limit fields
Most search programs make it easy to limit the search to the title field, the publication year and some other fields. Google serves up strange results even for the simple title search, ignoring obviously matching hits. Searching for the term Google in the title yields two hits. When you search for the word anywhere, the first 12 of the 28 hits show books where the term appears in the title. For perspective: Amazon has 23 fully searchable books with the word Google in the title.

Use of the date limit is also a letdown. It seems absurd that GBS has only 55 partially viewable books published in 2006. Amazon has 15,152. To its credit, GBS has 25 fully viewable books, but it is a small consolation.

Split results
The handling of fully viewable books is inconsistent in GBS, and therefore the results are unpredictable. Sometimes they are included in the All Books search, sometimes not; sometimes some of the fully viewable books are included in the All Books search, but not the others. The search for the word fundamentalism in the title yields 8 hits in the All Books list and 3 in the fully viewable result list. None of the latter appears in the former.

The search for the term ignorance returns 91 hits in the All Books result list, and 66 in the Full View result list. Four of the first five hits in the latter appear also in the All Books search result, but none of the other 62. Practically, if you want a comprehensive search you must repeat the search in both domains. This is very irritating. The simple query form should have check boxes to accommodate the user preferences for content type, and to make the result list consistent and predictable.

Confusing hit counts
It certainly discombobulates users when hits are reported in terms of pages rather than books. When searching for the macaque monkey, for example, 26 pages are reported in the result list. Actually, 26 is the number of books listed, not the number of pages. The first two books (with a total of more than 1,000 pages) are dedicated to the social behavior of macaque monkeys. The search term must obviously appear on hundreds of pages in those two books alone, so the number of pages should be much higher than the number of books.

Using the search cell within the page of the first matching page shows that there are 30 pages where the search word occurs and are viewable. This is clearly the number of pages that GBS allows the user to view, not the number of pages on which the search term appears, let alone the total number of occurrences of the search word.

Even more enigmatic is the result list header on the first page of the search for the word arrogance, which says Books 1-10 with 4110 pages on intitle:arrogant OR intitle:arrogance. What is that score? The total number of pages in the books? Not likely, and it would not be relevant anyhow. The total number of hits matching the word arrogant or arrogance in the books? That could be useful, but why is it shown only when there are more than 10 hits for the query? Why does it disappear when you get to the end of the result list? Why is it not shown when you set the num= parameter higher than the default 10 hits per page?

The search for publisher Houghton Mifflin produces a list that claims 10,100,000 (yes, ten million one hundred thousand) pages as hits. By the time you scroll down the list, it settles for 53 books, and 53 pages.

The header on the top of the short result list should offer much better information, reporting that there are X number of occurrences of macaque, on Y number of pages in N books. There are Z number of pages which can be displayed.
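As a rough illustration of the header proposed above, the sketch below keeps the four counts separate and labels each one. The function name format_result_header and its parameters are hypothetical, and the numbers in the call are placeholders, not measurements from GBS.

```python
def format_result_header(term: str, occurrences: int, pages: int,
                         books: int, viewable_pages: int) -> str:
    """Build a result-list header that reports each count under its own label."""
    return (
        f"{occurrences} occurrences of '{term}' "
        f"on {pages} pages in {books} books; "
        f"{viewable_pages} pages can be displayed."
    )


# Placeholder values for illustration only.
print(format_result_header("macaque", 1200, 450, 26, 30))
```

Keeping occurrences, pages, books, and viewable pages as separate arguments means that no single number can be mistaken for another, which is the confusion described in this section.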

The scanning process brings its own oddities. It caught my attention that in the search for the word ignorance there is an item authored by Plea, and the title starts with “A plea for strengthening …”. I just wondered why the letter A was not misinterpreted as the initial of the first name of Mr. Plea. I could not imagine why Haydn’s dictionary from 1883 came up for my search for tsunami in dictionaries, when the word was not even used in that year. It turns out that the name of a Turkish pasha, Osman, was considered to be a match. In fairness, Amazon also has odd results for scanning reasons, and Google has a much more difficult task scanning materials from centuries earlier. About 95% of the books in the SIB collection are less than 30 years old, in my estimate.

These problems are not nearly as lethal in this database as in Google Scholar, which has very similar deficiencies, and is used by some too-enthusiastic scientists in various disciplines. They take the hit counts and the citation scores reported by Google Scholar without checking their plausibility, then feed the numbers to their programs, which diligently churn out many useless statistical measures. They give a publisher an embellished pseudo-scholarly paper based on often inflated hit counts and phantom citations, and these papers are cited, exciting other researchers. You can find examples for the serious problems of Google Scholar, and the puppy love attitude of serious researchers, in a PowerPoint presentation for the closing session of the UKSG conference, and in a paper published in Online Information Review.

The Content
GBS includes eBooks converted from scanned print publication format and books received directly from the publishers in digital format. Character recognition in the scanning process is never 100% accurate, but the ratio of scanning errors was small in my samples (as it is in Amazon). Even in most of those cases, the context made clear for the naked eye what the original word may have been. Of course, for searching purposes these words are lost, as they are not matching the query term. However, if the word appears more than once in the text, the book is still retrieved, and if the word appears more than once on the same page and at least once correctly, the specific page will also show up in the results.
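The effect of scanning errors on retrieval can be sketched with a toy exact-match search. This is an assumption-laden illustration, not a description of how GBS indexes text: the function search_book, the sample pages, and the misreading "rnacaque" are all invented for the example.

```python
def search_book(pages: list[str], term: str) -> tuple[bool, list[int]]:
    """Exact-match search over OCR text, one string per page.

    Returns whether the book matches at all, and the 1-based numbers
    of the pages on which the term was recognized correctly.
    """
    term = term.lower()
    matching_pages = [
        number
        for number, text in enumerate(pages, start=1)
        if term in text.lower().split()
    ]
    return bool(matching_pages), matching_pages


# Invented sample: "rnacaque" stands in for a misread "macaque".
pages = [
    "the macaque is a monkey",    # page 1: recognized correctly
    "the rnacaque troop",         # page 2: misread, lost for searching
    "a rnacaque and a macaque",   # page 3: one misread, one correct
]
print(search_book(pages, "macaque"))
```

In this sketch the book is still retrieved, page 3 is found because one of its two occurrences was read correctly, and page 2 is missed because its only occurrence was garbled.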

Database composition
GBS offers four content viewing options. The most generous is the full view option, which allows thumbing through the entire book as well as downloading the book in PDF format. Books have this option if they are in the public domain or if the copyright holder asked Google to make them viewable without restriction, as is the case with the 2001 edition of the nearly 300-page book about Hawaii in the Daytrips series. There is no equivalent to this category yet in Amazon.

Copyright holders mostly choose the limited view option when only about 20-25% of the pages can be viewed and downloading/printing are disabled. Still, they can be very informative for getting a feel about the content, style and format of the book, to decide if the book is worth buying, borrowing or requesting through interlibrary loan. You can read reviews about the spectacularly illustrated Concise Animal Encyclopedia, but taking a glance at a picture or two of this book is, indeed, worth a thousand words of reviews.

The limited view option is not too limiting for those who just need some factual information about a person, a place, an event or a concept. For example, the Best Beaches of Hawai’i book is just perfect in this format for getting concise information.

The index page shows one page for Lanikai, which turns out to be the first page of a three-page sub-section, and you can read it through from page 19 through page 20 to page 21. You can go fishing for another beach in the table of contents, which is usually available in its entirety for most books even in limited view, and pick another beach name for the next query, then jump to the appropriate page shown in the sidebar of the search result page.

The snippet view option has very restricted viewing options, just a paragraph from a few pages at best which include your search terms. This still could be useful for a ready reference question, such as the meaning of a word, especially when it is a geographic name (usually not included in many general dictionaries), and a gazetteer would not provide the meaning. Occasionally, there are books that appear both as no preview and snippet view types.

It is another question whether the source defines the term correctly. In ready reference, corroboration of the information is crucial, but can be time consuming. In the example above, heavenly shore for Lanikai is a rather loose translation. One of the beauties of GBS is that even the snippets might give a hint, and then clicking on an adjacent entry might confirm or contradict the information. In this search result, the entry right above the one with the snippet view happens to be an excerpt from the book Hawai’i Place Names, and it provides a much more informative and credible account of the meaning of the name of the beach of the small town.

The most restrictive option provides only the usual bibliographic data, but no preview. It is still useful, as at least you would know that your search term occurs somewhere in the book, except when it does not. Searching for my last name, for example, brings back books that include Jacson instead of Jacso. Of course, you don’t know about such mistakes if there is no preview.

Database size
It would be useful to know the proportion of books in each category discussed above. Google does not provide any quantitative information about the database itself, or such details as the ratio of books in the different categories.

As is usual with Google services, it is not possible to determine through special searches how many items there are in the database, or get factual information about other aspects of the content, such as the distribution of items by publication year (at least by broad range, such as for the last decade).

There is a publication year range cell on the advanced template, but it is like a prop in the cheap B-movies. It does not work if you touch it. For example, the search for books published in the past 10 years which include the word “love” anywhere in the body of the text, yields an implausibly low number of 18 hits from GBS.

Oprah used to recommend more than that between two commercial breaks. The Amazon SIB subset, for books published in the past 10 years that include the word “love” anywhere in the body of the text, yields 191,178 hits. It’s a reasonable number that would please all reading club members and talk-show participants. Extending the time span in GBS to more than 500 years increases the hit count by only 3, to 21.

If the subject word is dropped to find out how many books there are in GBS published between 1496 and 2005, the hit number goes up to 59. That would be pathetic even in the eye of those bloggers who get instantly infatuated with any Google service without really testing them.

Because of the crippling software limitations, the best alternative approach may be to compare results from GBS with Amazon’s SIB subset for semantically equivalent (but sometimes syntactically different) queries, without using date limitation or the more advanced but often dysfunctional query combinations and filters, which would be guaranteed to leave GBS in the dust.

Database sources
My samples have shown that not only books but all kinds of printed materials, such as pamphlets, are present in the database, from every time period and in every genre. Sometimes odd items show up in the result list that are certainly not books but journals, whose GBS records were apparently created from the journal title lists of Ebsco and ProQuest (which are described as authors), or from publishers’ catalogs of books.

Unfortunately, it is impossible to estimate, let alone determine, their absolute numbers. As for the scope of publishers, the biggest names have submitted books in digital format for inclusion, including both university presses, such as Oxford, Cambridge, Princeton, and Chicago, and, to a lesser extent, commercial publishers, such as Penguin, Springer and Houghton Mifflin. From the perspective of ready reference, encyclopedias, dictionaries, almanacs, and factbooks are the most important traditional sources. Limiting the search to one of these words in the title showed a good variety of ready reference works with a definition and/or description for the term I searched for.

Even more importantly, non-reference books can now serve as ready reference sources by virtue of searching the entire body of text of all kinds of books. Occasionally, a quick search in GBS can return a wealth of ready reference information for a question which classical dictionaries, encyclopedias, and almanacs don’t answer.

Results of test searches
A search for the definition or description of affluenza yields no result from any of the following dictionaries: American Heritage, Chambers, Collins, Cambridge American English, Longman Contemporary English, Merriam-Webster (10th, 11th, and unabridged editions), Oxford Concise, Compact Oxford, any of the dictionaries in the Oxford Reference Online suite, and Wordsmyth. Only the Oxford English Dictionary had a definition with sample citations.

In contrast, GBS finds 29 books where the word appears. Actually, the first one is a book titled Affluenza, dedicated to the topic. Even the snippets shown on the result list might give the answer, or take you directly to the answer in the book.

With that said, Amazon shows its superiority not only by bringing up the same book (although only as the 9th hit) but also 290 other books in which the word appears. It also includes reviews (from Booklist, Library Journal, and several other review publications, incorporated in the master record) and offers many other informative features, including links to 116 other books cited by Affluenza.

Searches by the names of 15 publishers showed big differences between the Amazon SIB collection and GBS. The latter came up better only for O’Reilly and the University of Hawaii Press, with 36 versus 10 and 37 versus 3, respectively.

For the rest, Amazon was incomparably better, as illustrated by university presses such as Oxford (7,045 vs 57), Cambridge (11,445 vs 53), University of Chicago (2,923 vs 43), and Princeton (2,193 vs 48), as well as commercial publishers Houghton Mifflin (736 vs 56), Blackwell (3,114 vs 61), Penguin (2,090 vs 16), Springer (13,138 vs 65), Taylor and Francis (1,565 vs 52), and McGraw-Hill (4,210 vs 34).
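
To put these gaps in perspective, the hit counts can be turned into ratios. What follows is only an illustrative sketch: the figures are the ten Amazon vs GBS pairs quoted in the paragraph above, while the names `HITS` and `ratio_report` are made up for this example, and nothing here queries Amazon or GBS.

```python
import math

# (Amazon SIB hits, GBS hits) per publisher, copied from the review's figures.
HITS = {
    "Oxford": (7045, 57),
    "Cambridge": (11445, 53),
    "University of Chicago": (2923, 43),
    "Princeton": (2193, 48),
    "Houghton Mifflin": (736, 56),
    "Blackwell": (3114, 61),
    "Penguin": (2090, 16),
    "Springer": (13138, 65),
    "Taylor and Francis": (1565, 52),
    "McGraw-Hill": (4210, 34),
}


def ratio_report(hits):
    """Map each publisher to (Amazon/GBS ratio, log10 of that ratio)."""
    report = {}
    for publisher, (amazon, gbs) in hits.items():
        ratio = amazon / gbs
        report[publisher] = (round(ratio, 1), round(math.log10(ratio), 2))
    return report


if __name__ == "__main__":
    # Print publishers from the largest gap to the smallest.
    rows = sorted(ratio_report(HITS).items(), key=lambda kv: kv[1][0], reverse=True)
    for publisher, (ratio, orders) in rows:
        print(f"{publisher:22s} {ratio:7.1f}x  (log10 = {orders})")
```

On these figures the ratios run from roughly 13 to 1 (Houghton Mifflin) to roughly 216 to 1 (Cambridge).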

The hit numbers in GBS fluctuated somewhat during my test. I did not reduce the hit counts for false drops, such as a matching author name appearing in the publisher field for Taylor & Francis, in snippet view and no preview records. These numbers may not include the 200 or so full view books offered by the publishers. As the difference is two orders of magnitude, it was not worth the effort to check how many of those are included in the All Books counts, and how many are indeed unique and thus to be added. The full view offering is a laudable feature of GBS but does not change the picture. I hope that this low number of items from the largest publishing partners of Google is just a software failure, not shallow content. Publishers could easily run some tests on their titles.

As far as the legally indisputable clean subset of GBS is concerned, it complements Amazon’s SIB very well. Time and again I found top-notch ready reference sources in GBS with the limited preview option which are not searchable through Amazon’s SIB subset. There are many comments on the GBS site by some Google-smitten bloggers about GBS. Most of them sound like those in the midnight commercials by exuberant housewives finding their true love in a laundry detergent or sink cleaning gizmo. Google prominently quoted from Tom Bruno’s Jersey Exile blog, but should not take at face value what Tom, a library assistant at Harvard University, wrote: “Google's search capabilities beat the pants off of its competitor [Amazon]. Google Print also doesn't muddle the results of its searches by trying to sell you unrelated stuff conjured up by your keyword searches in Amazon.” Beyond simple keyword searching, Google’s software seems to be cognitively challenged, to put it nicely, and hinders access to the content, which would deserve software that is at least functional and half as smart as Amazon’s.

Opinions expressed in this review do not necessarily reflect the opinions of Thomson Gale, its employees or affiliates. We cannot guarantee the accuracy of information contained in non-Thomson Gale sites.

posted by gljblog at 00:00 | Google Book Search project

An evaluation of Google Book Search as a reference tool

http://www.dap.ndl.go.jp/ca/modules/car/index.php?p=2504

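An evaluation of Google Book Search as a reference tool

Among the services that US-based Thomson Gale, whose planned sale was announced recently, makes publicly available is a column in which Dr. Peter Jacso, a professor in the Library and Information Science Program of the Department of Information and Computer Sciences at the University of Hawaii, picks one or two digital information resources each month, offered online or as packaged products, and evaluates them as reference tools. The November 2006 installment covers Google Book Search, which has been attracting a great deal of attention. According to the review, Google Book Search has weaknesses in its search functions, but even a partial display can yield information not found in other reference sources, and, like Amazon’s Search Inside the Book (「なか見!検索」 in the Japanese edition), it should be useful for ready reference.

In addition, the blog “ResourceShelf” has in turn reviewed this article by Professor Jacso.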

Google Book Search - Peter’s Digital Reference Shelf
http://www.gale.com/reference/peter/googlebooks.htm
Peter’s Digital Reference Shelf
http://www.gale.com/reference/peter/
(Note: earlier installments can also be searched from the Archives on the left.)

ResourceShelf article dated November 7, 2006
http://www.resourceshelf.com/2006/11/07/8767/
posted by gljblog at 00:00 | Google Book Search project
