Microsoft Technical Fellow Dave Campbell: Data Is There to Help You Make Decisions
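Original [51CTO Exclusive Interview] At Microsoft there is a group of people known as "Technical Fellows." This is Microsoft's internal title for a select group of technology leaders. A Technical Fellow's foresight, expertise, and industry leadership in a given technology are regarded as equivalent to a Microsoft vice president's leadership on the business side.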
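The other day, a 51CTO editor had the chance to meet one such Microsoft Technical Fellow and hold a brief interview with him. David Campbell joined Microsoft in 1994, just as the company was moving into the enterprise software market, and at the time he focused mainly on the database and storage business. Since 2008, Dave has shifted his focus to Azure, Microsoft's cloud platform, and big data strategy, with particular attention to cloud-scale computing and realizing the value of ambient data.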
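Dave came to Shanghai this time as a member of the strategic advisory committee for the STB China team of the Microsoft Asia-Pacific R&D Group, both to learn about the group's work over the past year and to share Microsoft's views on overall technology trends.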
"At Microsoft, what we emphasize in big data is the fourth V: value. Whether you can obtain the data is not what matters most; what matters is that you can get value out of it. That is the most fascinating part."
So how exactly does Dave define the value of big data? The interview transcript follows.
51CTO: Hi Dave, thank you for taking our interview! You just mentioned the fourth V of big data, value. How do you define that value for your customers in different industries?
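Dave: Big data has enormous potential. One interesting thing is that when many people talk about big data, they think it only relates to the Internet. In fact, it is relevant to financial services, oil exploration, the optimization of transactions in e-commerce, life sciences, pharmaceutical research, and many other areas. In pharmaceuticals, we have accumulated more than a decade of drug trial data, along with all kinds of data on reactions between chemical elements. In oil and natural resources, we have done a lot of work on optimizing exploration. It really is a great many fields. Another very important aspect is how we improve people's productivity inside the enterprise. We talk about the social graph, the social graph within a company. Are other people in a role like mine paying attention to the same information I am, or are they looking at something else? There are many opportunities across many industries.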
51CTO: But when we talk about traditional enterprises, won't they need sensors to collect data?
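Dave: Not necessarily. Take the social graph inside a company: the information we have includes the company's organizational structure, the emails employees send one another, the messages they send through other communication tools, and so on. Which people are looking at the same report in the BI system? With this information, you can arrive at many interesting insights.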
51CTO: So this is all born-digital data.
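Dave: Yes. Another aspect of the digital world is the digital exhaust from a company's existing business. This data is now easy to collect: just by observing data exchanges, you can see who is communicating with whom. There is a lot that can be done.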
51CTO: How does Microsoft help customers find the value that big data holds for them?
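Dave: Big data today is still a bit like a group of PhDs tinkering with a pile of software and doing a lot of mathematical proofs. In the end, though, many people have to make all kinds of decisions every day. Can we help them make better decisions? There is a saying, "from terabytes to insight": even if you do your data work in Excel, what we want is to use that data to help make better decisions.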
51CTO: Have you seen these companies run into constraints when using their data?
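Dave: Personally, I think the biggest constraint is that they need people to "unlearn" things before they can see the new value of big data. For example, many companies have built and deployed data warehouses. They may think, this is my big data. But that is only one version of the truth. I think that in this new era there are many versions of the truth. The data in the data warehouse is one version, and it is still valuable. But another version of the truth is how people on social networks are talking about your product, and that version matters a great deal too. Yet another version is what I mentioned earlier, the company's digital exhaust. So there are many opportunities; the biggest constraint is whether anyone can step back and see all the versions of the truth.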
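51CTO: Recently we have seen a new trend called the "data store." It is somewhat like the industry reports of the past, except that what we get now is a real-time data service. What do you think of this kind of service? How has the way people consume data changed?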
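Dave: At the personal level, 20 years ago people who followed finance, such as stocks or futures, got their information from the newspaper. Today financial information is obtained online. A few years ago I was talking with people at a large airline, and they mentioned wanting to optimize fuel usage, or to decide whether to move planes from an airport to a safe location before a storm hits. If you think about it, where would they get that information? Where would they get the weather prediction model to decide whether to move the planes? It really comes down to moving and analyzing data. Data services will become very, very important. Beyond large-scale services like that, I think services aimed inside the enterprise matter even more. At Microsoft we have talked about "self-service BI" for many years: give people tools and let them derive their own insights and make decisions. That is only part of it. People now ask for more information sources, not only data inside the company but data from outside it as well, to help them form good insights into their part of the business.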
51CTO: So where does this data come from?
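Dave: That is the challenge we face now. Part of our work is to make the data accessible and searchable. With the internal social graph I mentioned earlier, an important question is: what information sources are other people in my position using? It is a bit like a recommendation engine for data services. That makes information easier to find and easier to collect.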
The English transcript of the interview follows.
51CTO: Hi Dave, thanks for taking our interview! Our first question is, how do you define big data's value to your customers in various industries?
Dave: There is a lot of potential in big data. One of the interesting things we are finding is that many people think big data is only applicable to Internet-scale services. We are seeing a lot of people getting value from it: financial services, oil exploration, digital commerce and the optimization of commerce transactions. Another space is life sciences and pharmaceuticals; in many cases we have worked on, people are looking over decades of clinical trial data to try to find interactions between compounds we already have that could be put to use. And finally, oil and gas, natural resources, and the optimization of exploration as well. Really a variety of them. One of the most interesting ones, I think, is: how do we improve the productivity of working people in businesses? There is a lot of talk around the social graph, and there are interesting social graphs within businesses. Do other people like me, in similar roles, consume the same information? There are a lot of opportunities in a lot of industries.
51CTO: So when you talk about traditional industries, you need sensors to collect data?
Dave: Not necessarily. Think about the social graphs inside a business. There is a directory inside the business, so we know how people are related from the organizational structure. Then think about the emails that people send to one another, or other forms of communication; there is really a variety of things. If you look at which people look at the same report in the BI system, you can come up with some very interesting insights.
51CTO: So that's data that is born digital.
Dave: Yes. The other part of the digital picture is the digital exhaust from the existing business. It's quite easy to collect that now. You can just look at who is interacting with whom, based upon the digital exchanges. There is a lot that we can do.
51CTO: How is Microsoft helping customers in realizing the potential of big data, especially for different industries?
Dave: One of the things about big data thus far is that it has been about PhDs working with software they stitch together, with a lot of algebra and proofs. But at the end of the day, there are lots of people who are making lots of decisions. Can we help them make better decisions? Sometimes we talk about this as going from terabytes to insights. We find that a lot of people are just working in tools like Microsoft Excel, so how do we get it all the way from sitting in the big data cluster to someone looking at something and making a better decision?
51CTO: So we need analysis tools to do that.
Dave: Yes.
51CTO: Do you see any constraints those companies are facing? They want to get value from their data, but are they running into constraints?
Dave: I think, honestly, one of the biggest constraints is that they need to have people unlearn things to be able to see big data's new value. I'll give you one example. There are a lot of people who have deployed large-scale data warehouses, and they think, that is my big data. That is one version of the truth. I think that in this new world there are really multiple versions of the truth. The version of the truth you have in your data warehouse is still valuable. But there is a version of the truth about what people are saying in the social space about your product. If you are not paying attention to what people are saying on the social web, you are missing a version of the truth that is also important. There is also another version of the truth that can be derived from what I mentioned before, the digital exhaust from the existing business. There is really a lot of opportunity, and one of the constraints is getting people to step back and see that there is great potential in this.
51CTO: Cool. Recently there is a new thing coming up called the Data Store. It is something like reports delivered as live data services. What do you think about this kind of service? And how has the way people consume data changed?
Dave: On a personal level, if you look at people who were tracking finances, stocks or equities, 20 years ago, they would look at the newspaper. Today there are a lot of web services for getting financial data. A couple of years ago, I was talking to a large airline. They wanted to be able to optimize fuel usage, or, if there was a storm coming in, to decide whether to move their planes from the airport to safe areas. If you think about where they would get this information, where they are going to get the weather model to tell them whether to move their planes or not, all of this is about moving data and being able to transform and interpret it. Data services will become a very, very important thing. And that is at large scale. I think it matters even more within a business. We have been talking about it for a few years; Microsoft calls it self-service BI. Let people have the tools to derive their own insights and make decisions. That is all very well, but now what people ask is where the information pieces are, whether it's data services within the business that allow me to look at these things together, that give me insights into my part of the business.
51CTO: How do you find this data now?
Dave: Today it is a challenge. Some of the technologies we are working on are just meant to make the data available and allow it to be searched. We spoke earlier about the internal graph. Imagine: other people in my role, what information sources do they use? It's like a recommendation engine for data services. It makes information easier to find and easier to collect.
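Editor's note: the "recommendation engine for data services" idea can be illustrated with a small sketch. This is not Microsoft's implementation, which the interview does not describe. It is a minimal example that assumes two hypothetical inputs: a mapping from each person to a role, and a mapping from each person to the data sources they use. It suggests sources that a person's role peers use but the person does not, ranked by how many peers use them. All names and sample data are invented for illustration.

```python
from collections import Counter


def recommend_sources(user, roles, usage, top_n=3):
    """Rank data sources used by the user's role peers but not yet by the user.

    roles: dict of person -> role name (hypothetical directory data)
    usage: dict of person -> set of data source names (hypothetical usage log)
    """
    already_used = usage.get(user, set())
    # Peers are other people who share the user's role.
    peers = [p for p, r in roles.items() if r == roles[user] and p != user]
    # Count how many peers use each source the user has not touched yet.
    counts = Counter(
        source
        for peer in peers
        for source in usage.get(peer, set())
        if source not in already_used
    )
    # Sort by peer count (descending), then by name for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [source for source, _ in ranked[:top_n]]


# Invented sample data, for illustration only.
roles = {"ana": "analyst", "ben": "analyst", "cy": "analyst", "dee": "engineer"}
usage = {
    "ana": {"sales_warehouse"},
    "ben": {"sales_warehouse", "weather_feed", "social_mentions"},
    "cy": {"weather_feed"},
    "dee": {"build_logs"},
}

print(recommend_sources("ana", roles, usage))
```

In this sample, two of ana's fellow analysts use `weather_feed` and one uses `social_mentions`, so those are suggested in that order. A real system would presumably weight signals such as report views or email exchanges, which this sketch leaves out.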